PREFACE.



There are two approaches to presenting linear algebra and multidimensional
geometry. The first can be characterized as the «coordinates and
matrices approach». The second is the «invariant geometric approach».

Most textbooks use the coordinates and matrices approach. It starts
with systems of linear algebraic equations. Then the theory of
determinants is developed, and matrix algebra and the geometry of the space $\mathbb{R}^n$
are considered. This approach is convenient for a first introduction to the subject
since it is based on very simple concepts: numbers, sets of numbers,
numeric matrices, linear functions, and linear equations. The proofs within this
approach are conceptually simple and are mostly based on calculations. However,
in the further development of the subject the coordinates and matrices approach is less
advantageous. Computational proofs become huge, while the intention to consider
only numeric objects prevents us from introducing and using new concepts.

The invariant geometric approach, which is used in this book, starts with the
definition of an abstract linear vector space. Here the coordinate representation
of vectors is not of crucial importance; the set-theoretic methods commonly used
in modern algebra become more important. A linear vector space is the very object
to which these methods apply in the simplest and most effective way: proofs of many
facts can be shortened and made more elegant.

The invariant geometric approach prepares the reader for the study
of more advanced branches of mathematics such as differential geometry, commutative
algebra, algebraic geometry, and algebraic topology. I prefer a self-sufficient
way of explanation. The reader is assumed to have only minimal preliminary
knowledge of matrix algebra and of the theory of determinants. This material is
usually given in courses of general algebra and analytic geometry.

By the term «numeric field» in this book we mean one of the following
three fields: the field of rational numbers $\mathbb{Q}$, the field of real numbers $\mathbb{R}$, or the
field of complex numbers $\mathbb{C}$. Therefore the reader need not know the general
theory of numeric fields.

I am grateful to E. B. Rudenko for reading and correcting the manuscript of
the Russian edition of this book.



May, 1996; 
May, 2004. 



R. A. Sharipov. 
LINEAR VECTOR SPACES AND LINEAR MAPPINGS.



§ 1. The sets and mappings. 

The concept of a set is a basic concept of modern mathematics. It denotes any
group of objects that are, for some reason, distinguished from other objects and grouped
together. Objects constituting a given set are called the elements of this set. We
usually assign some literal names (identifiers) to sets and to their elements.
Suppose the set $A$ consists of three objects $m$, $n$, and $q$. Then we write

$$A = \{m, n, q\}.$$

The fact that $m$ is an element of the set $A$ is denoted by the membership sign:
$m \in A$. The notation $p \notin A$ means that the object $p$ is not an element of the set $A$.

If we have several sets, we can gather all of their elements into one set, which
is called the union of the initial sets. To denote this gathering operation we
use the union sign $\cup$. If we gather the elements each of which belongs to all of our
sets, they constitute a new set, which is called the intersection of the initial sets. To
denote this operation we use the intersection sign $\cap$.

If a set $A$ is a part of another set $B$, we denote this fact as $A \subset B$ or $A \subseteq B$
and say that the set $A$ is a subset of the set $B$. The two signs $\subset$ and $\subseteq$ are equivalent.
However, using the sign $\subseteq$, we emphasize that the condition $A \subset B$ does not
exclude the coincidence of sets $A = B$. If $A \subsetneq B$, then we say that the set $A$ is a
strict subset of the set $B$.

The term empty set is used to denote the set that comprises no elements at
all. The empty set is assumed to be a part of any set: $\varnothing \subset A$.
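
The operations with sets described above can be tried out on small finite sets. The following Python sketch uses the built-in `set` type; the second set `B` and its elements are chosen here only for illustration and do not come from the text.

```python
# Finite sets modeled with Python's built-in set type (illustrative values only).
A = {"m", "n", "q"}
B = {"n", "q", "r"}          # a second set, assumed for the example

union = A | B                # all elements of A and B gathered together
intersection = A & B         # elements belonging to both A and B

is_member = "m" in A             # m is an element of A
is_not_member = "p" not in A     # p is not an element of A
empty_is_subset = set() <= A     # the empty set is a part of any set
strict_subset = {"m", "n"} < A   # a subset that does not coincide with A
```

Here `<=` plays the role of the subset sign that admits coincidence of sets, while `<` corresponds to a strict subset.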

Definition 1.1. The mapping $f\colon X \to Y$ from the set $X$ to the set $Y$ is a
rule $f$ applicable to any element $x$ of the set $X$ and such that, being applied to a
particular element $x \in X$, it uniquely defines some element $y = f(x)$ in the set $Y$.

The set $X$ in definition 1.1 is called the domain of the mapping $f$. The
set $Y$ in definition 1.1 is called the domain of values of the mapping $f$. The
notation $f(x)$ means that the rule $f$ is applied to the element $x$ of the set $X$. The
element $y = f(x)$ obtained as a result of applying $f$ to $x$ is called the image of $x$
under the mapping $f$.

Let $A$ be a subset of the set $X$. The set $f(A)$ composed of the images of all
elements $x \in A$ is called the image of the subset $A$ under the mapping $f$:

$$f(A) = \{y \in Y\colon \exists\, x\ ((x \in A)\ \&\ (f(x) = y))\}.$$

If $A = X$, then the image $f(X)$ is called the image of the mapping $f$. There is
a special notation for this image: $f(X) = \operatorname{Im} f$. The set of values is another term
used for $\operatorname{Im} f = f(X)$; don't confuse it with the domain of values.
Let $y$ be an element of the set $Y$. Let's consider the set $f^{-1}(y)$ consisting of all
elements $x \in X$ that are mapped to the element $y$. This set is called the
total preimage of the element $y$:

$$f^{-1}(y) = \{x \in X\colon f(x) = y\}.$$

Suppose that $B$ is a subset of $Y$. Taking the union of the total preimages of all
elements of the set $B$, we get the total preimage of the set $B$ itself:

$$f^{-1}(B) = \{x \in X\colon f(x) \in B\}.$$

It is clear that for the case $B = Y$ the total preimage $f^{-1}(Y)$ coincides with $X$.
Therefore there is no special sign for denoting $f^{-1}(Y)$.
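
For finite sets the image and the total preimage can be computed directly from their definitions. Below is a minimal sketch in which a mapping is stored as a Python dictionary; the sets `X`, `Y`, the mapping `f`, and the helper names `image` and `preimage` are assumptions made for this illustration.

```python
# A mapping f: X -> Y on finite sets, stored as a dict (hypothetical example data).
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "b"}

def image(f, A):
    """f(A): the set of images of all elements of the subset A."""
    return {f[x] for x in A}

def preimage(f, B):
    """Total preimage of B: all x in the domain with f(x) in B."""
    return {x for x in f if f[x] in B}

im_f = image(f, X)            # Im f = f(X), the set of values
pre_a = preimage(f, {"a"})    # total preimage of the element "a"
pre_c = preimage(f, {"c"})    # "c" is not an image of any element
pre_Y = preimage(f, Y)        # coincides with X
```

In this example the set of values `im_f` is smaller than the domain of values `Y`, which shows why the two terms should not be confused.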

Definition 1.2. The mapping $f\colon X \to Y$ is called injective if the images of any
two distinct elements $x_1 \neq x_2$ are different, i.e. $x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$.

Definition 1.3. The mapping $f\colon X \to Y$ is called surjective if the total preimage
$f^{-1}(y)$ of any element $y \in Y$ is not empty.

Definition 1.4. The mapping $f\colon X \to Y$ is called a bijective mapping or
a one-to-one mapping if the total preimage $f^{-1}(y)$ of any element $y \in Y$ is a set
consisting of exactly one element.

Theorem 1.1. The mapping $f\colon X \to Y$ is bijective if and only if it is injective
and surjective simultaneously.

Proof. According to the statement of theorem 1.1, simultaneous injectivity
and surjectivity is a necessary and sufficient condition for the bijectivity of the mapping
$f\colon X \to Y$. Let's prove the necessity of this condition first.

Suppose that the mapping $f\colon X \to Y$ is bijective. Then for any $y \in Y$ the total
preimage $f^{-1}(y)$ consists of exactly one element. This means that it is not empty.
This fact proves the surjectivity of the mapping $f\colon X \to Y$.

However, we need to prove that $f$ is not only surjective, but injective as well.
Let's prove the injectivity of $f$ by contradiction. If the mapping $f$ is not injective,
then there are two distinct elements $x_1 \neq x_2$ in $X$ such that $f(x_1) = f(x_2)$. Let's
denote $y = f(x_1) = f(x_2)$ and consider the total preimage $f^{-1}(y)$. From the
equality $f(x_1) = y$ we derive $x_1 \in f^{-1}(y)$. Similarly, from $f(x_2) = y$ we derive
$x_2 \in f^{-1}(y)$. Hence, the total preimage $f^{-1}(y)$ is a set containing at least two
distinct elements $x_1$ and $x_2$. This fact contradicts the bijectivity of the mapping
$f\colon X \to Y$. Due to this contradiction we conclude that $f$ is surjective and
injective simultaneously. Thus, we have proved the necessity of the condition
stated in theorem 1.1.

Let's proceed to the proof of sufficiency. Suppose that the mapping $f\colon X \to Y$
is injective and surjective simultaneously. Due to the surjectivity the sets $f^{-1}(y)$
are non-empty for all $y \in Y$. Suppose that some one of them contains more
than one element. If $x_1 \neq x_2$ are two distinct elements of the set $f^{-1}(y)$, then
$f(x_1) = y = f(x_2)$. However, this equality contradicts the injectivity of the
mapping $f\colon X \to Y$. Hence, each set $f^{-1}(y)$ is non-empty and contains exactly
one element. Thus, we have proved the bijectivity of the mapping $f$. □
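
Definitions 1.2-1.4 translate almost literally into checks for mappings of finite sets. The sketch below stores a mapping as a dictionary and compares both sides of theorem 1.1 on three sample mappings; the function names and the sample data are assumptions of this illustration, and such a check on examples is, of course, not a proof.

```python
def preimage_of(f, y):
    """Total preimage of the element y under the mapping f (a dict)."""
    return {x for x in f if f[x] == y}

def is_injective(f):
    # distinct elements have distinct images
    return len(set(f.values())) == len(f)

def is_surjective(f, Y):
    # every total preimage is non-empty
    return all(len(preimage_of(f, y)) > 0 for y in Y)

def is_bijective(f, Y):
    # every total preimage consists of exactly one element
    return all(len(preimage_of(f, y)) == 1 for y in Y)

Y = {"a", "b", "c"}
examples = [
    {1: "a", 2: "b"},                   # injective, not surjective
    {1: "a", 2: "b", 3: "c", 4: "c"},   # surjective, not injective
    {1: "a", 2: "b", 3: "c"},           # bijective
]
theorem_1_1_holds = all(
    is_bijective(f, Y) == (is_injective(f) and is_surjective(f, Y))
    for f in examples
)
```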
Theorem 1.2. The mapping $f\colon X \to Y$ is surjective if and only if $\operatorname{Im} f = Y$.

Proof. If the mapping $f\colon X \to Y$ is surjective, then for any element $y \in Y$
the total preimage $f^{-1}(y)$ is not empty. Choosing some element $x \in f^{-1}(y)$, we
get $y = f(x)$. Hence, each element $y \in Y$ is an image of some element $x$ under the
mapping $f$. This proves the equality $\operatorname{Im} f = Y$.

Conversely, if $\operatorname{Im} f = Y$, then any element $y \in Y$ is an image of some element
$x \in X$, i.e. $y = f(x)$. Hence, for any $y \in Y$ the total preimage $f^{-1}(y)$ is not
empty. This means that $f$ is a surjective mapping. □

Let's consider two mappings $f\colon X \to Y$ and $g\colon Y \to Z$. Choosing an arbitrary
element $x \in X$, we can apply $f$ to it. As a result we get the element $f(x) \in Y$.
Then we can apply $g$ to $f(x)$. The successive application of two mappings $g(f(x))$
yields a rule that associates each element $x \in X$ with some uniquely determined
element $z = g(f(x)) \in Z$, i.e. we have a mapping $\varphi\colon X \to Z$. This mapping is
called the composition of the two mappings $f$ and $g$. It is denoted as $\varphi = g \circ f$.

Theorem 1.3. The composition $g \circ f$ of two injective mappings $f\colon X \to Y$ and
$g\colon Y \to Z$ is an injective mapping.

Proof. Let's consider two elements $x_1$ and $x_2$ of the set $X$. Denote $y_1 = f(x_1)$
and $y_2 = f(x_2)$. Then $g \circ f(x_1) = g(y_1)$ and $g \circ f(x_2) = g(y_2)$. Due to the
injectivity of $f$, from $x_1 \neq x_2$ we derive $y_1 \neq y_2$. Then, due to the injectivity of $g$,
from $y_1 \neq y_2$ we derive $g(y_1) \neq g(y_2)$. Hence, $g \circ f(x_1) \neq g \circ f(x_2)$. The injectivity
of the composition $g \circ f$ is proved. □

Theorem 1.4. The composition $g \circ f$ of two surjective mappings $f\colon X \to Y$
and $g\colon Y \to Z$ is a surjective mapping.

Proof. Let's take an arbitrary element $z \in Z$. Due to the surjectivity of
$g$ the total preimage $g^{-1}(z)$ is not empty. Let's choose some arbitrary element
$y \in g^{-1}(z)$ and consider its total preimage $f^{-1}(y)$. Due to the surjectivity
of $f$ it is not empty. Then, choosing an arbitrary element $x \in f^{-1}(y)$, we get
$g \circ f(x) = g(f(x)) = g(y) = z$. This means that $x \in (g \circ f)^{-1}(z)$. Hence, the total
preimage $(g \circ f)^{-1}(z)$ is not empty. The surjectivity of $g \circ f$ is proved. □

As an immediate consequence of the above two theorems we obtain the following 
theorem on composition of two bijections. 

Theorem 1.5. The composition $g \circ f$ of two bijective mappings $f\colon X \to Y$ and
$g\colon Y \to Z$ is a bijective mapping.
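
The composition of mappings and theorems 1.3-1.5 can be illustrated on finite sets. In the sketch below mappings are dictionaries, the helper `compose` builds the composition, and surjectivity is checked through the criterion of theorem 1.2 (the set of values coincides with the whole target set). All names and sample data are assumptions of this example.

```python
def compose(g, f):
    """Composition of finite mappings stored as dicts: x -> g(f(x))."""
    return {x: g[f[x]] for x in f}

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, target):
    # theorem 1.2: f is surjective if and only if Im f coincides with the target set
    return set(f.values()) == set(target)

X, Y, Z = {1, 2, 3}, {"a", "b", "c"}, {10, 20, 30}
f = {1: "b", 2: "c", 3: "a"}       # a bijective mapping f: X -> Y
g = {"a": 30, "b": 10, "c": 20}    # a bijective mapping g: Y -> Z

gf = compose(g, f)                 # the composition of f and g, a mapping X -> Z
gf_is_bijective = is_injective(gf) and is_surjective(gf, Z)
```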

Let's consider three mappings $f\colon X \to Y$, $g\colon Y \to Z$, and $h\colon Z \to U$. Then we
can form two different compositions of these mappings:

$$\varphi = h \circ (g \circ f), \qquad \psi = (h \circ g) \circ f. \qquad (1.1)$$

The fact of coincidence of these two mappings is formulated as the following
theorem on associativity.

Theorem 1.6. The operation of composition for mappings is an associative
operation, i.e. $h \circ (g \circ f) = (h \circ g) \circ f$.
Proof. According to definition 1.1, the coincidence of two mappings
$\varphi\colon X \to U$ and $\psi\colon X \to U$ is verified by checking the equality $\varphi(x) = \psi(x)$ for an
arbitrary element $x \in X$. Let's denote $\alpha = h \circ g$ and $\beta = g \circ f$. Then

$$\begin{aligned}
\varphi(x) &= h \circ \beta(x) = h(\beta(x)) = h(g(f(x))), \\
\psi(x) &= \alpha \circ f(x) = \alpha(f(x)) = h(g(f(x))).
\end{aligned} \qquad (1.2)$$

Comparing the right hand sides of the equalities (1.2), we derive the required equality
$\varphi(x) = \psi(x)$ for the mappings (1.1). Hence, $h \circ (g \circ f) = (h \circ g) \circ f$. □

Let's consider a mapping $f\colon X \to Y$ and the pair of identical mappings
$\operatorname{id}_X\colon X \to X$ and $\operatorname{id}_Y\colon Y \to Y$. The last two mappings are defined as follows:

$$\operatorname{id}_X(x) = x, \qquad \operatorname{id}_Y(y) = y.$$

Definition 1.5. A mapping $l\colon Y \to X$ is called left inverse to the mapping
$f\colon X \to Y$ if $l \circ f = \operatorname{id}_X$.

Definition 1.6. A mapping $r\colon Y \to X$ is called right inverse to the mapping
$f\colon X \to Y$ if $f \circ r = \operatorname{id}_Y$.

The problem of existence of the left and right inverse mappings is solved by the
following two theorems.

Theorem 1.7. A mapping $f\colon X \to Y$ possesses a left inverse mapping $l$ if
and only if it is injective.

Theorem 1.8. A mapping $f\colon X \to Y$ possesses a right inverse mapping $r$ if
and only if it is surjective.

Proof of theorem 1.7. Suppose that the mapping $f$ possesses a left
inverse mapping $l$. Let's choose two elements $x_1$ and $x_2$ of the set $X$ and let's
denote $y_1 = f(x_1)$ and $y_2 = f(x_2)$. The equality $l \circ f = \operatorname{id}_X$ yields $x_1 = l(y_1)$
and $x_2 = l(y_2)$. Hence, the equality $y_1 = y_2$ implies $x_1 = x_2$, and $x_1 \neq x_2$ implies
$y_1 \neq y_2$. Thus, assuming the existence of a left inverse mapping $l$, we derive that the
direct mapping $f$ is injective.

Conversely, suppose that $f$ is an injective mapping. First of all let's choose
and fix some element $x_0 \in X$. Then let's consider an arbitrary element $y \in \operatorname{Im} f$.
Its total preimage $f^{-1}(y)$ is not empty. For any $y \in \operatorname{Im} f$ we can choose and fix
some element $x_y \in f^{-1}(y)$ in the non-empty set $f^{-1}(y)$. Then we define the mapping
$l\colon Y \to X$ by the following equality:

$$l(y) = \begin{cases} x_y & \text{for } y \in \operatorname{Im} f, \\ x_0 & \text{for } y \notin \operatorname{Im} f. \end{cases}$$

Let's study the composition $l \circ f$. It is easy to see that for any $x \in X$ and for
$y = f(x)$ the equality $l \circ f(x) = x_y$ is fulfilled. Then $f(x_y) = y = f(x)$. Taking into
account the injectivity of $f$, we get $x_y = x$. Hence, $l \circ f(x) = x$ for any $x \in X$.
The equality $l \circ f = \operatorname{id}_X$ for the mapping $l$ is proved. Therefore, this mapping is a
required left inverse mapping for $f$. The theorem is proved. □

Proof of theorem 1.8. Suppose that the mapping $f$ possesses a right
inverse mapping $r$. For an arbitrary element $y \in Y$, from the equality $f \circ r = \operatorname{id}_Y$
we derive $y = f(r(y))$. This means that $r(y) \in f^{-1}(y)$; therefore, the total
preimage $f^{-1}(y)$ is not empty. Thus, the surjectivity of $f$ is proved.

Now, conversely, let's assume that $f$ is surjective. Then for any $y \in Y$ the
total preimage $f^{-1}(y)$ is not empty. In each non-empty set $f^{-1}(y)$ we choose and
mark exactly one element $x_y \in f^{-1}(y)$. Then we can define a mapping $r\colon Y \to X$ by setting
$r(y) = x_y$. Since $f(x_y) = y$, we get $f(r(y)) = y$ and $f \circ r = \operatorname{id}_Y$. The existence of
the right inverse mapping $r$ for $f$ is established. □
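
The constructions used in proving theorems 1.7 and 1.8 can be carried out explicitly for mappings of finite sets. The following sketch stores mappings as dictionaries; the function names, the sample sets, and the rule of choosing the smallest element of each total preimage are assumptions made for this illustration, and any other choice would serve as well.

```python
def left_inverse(f, X, Y, x0):
    """For an injective f build a mapping l from Y to X such that l(f(x)) = x:
    l(y) is the preimage of y if y is in Im f, and the fixed element x0 otherwise."""
    l = {y: x0 for y in Y}
    for x in X:
        l[f[x]] = x      # for an injective f the total preimage of f(x) is {x}
    return l

def right_inverse(f, X, Y):
    """For a surjective f build a mapping r from Y to X such that f(r(y)) = y
    by marking one element in every total preimage (here: the smallest one)."""
    return {y: min(x for x in X if f[x] == y) for y in Y}

# an injective mapping that is not surjective
X1, Y1 = {1, 2}, {"a", "b", "c"}
f1 = {1: "a", 2: "c"}
l = left_inverse(f1, X1, Y1, x0=1)

# a surjective mapping that is not injective
X2, Y2 = {1, 2, 3}, {"a", "b"}
f2 = {1: "a", 2: "b", 3: "b"}
r = right_inverse(f2, X2, Y2)
```

Choosing another element `x0` or marking another element in a total preimage yields a different left or right inverse mapping for the same `f1` or `f2`.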

Note that the mappings $l\colon Y \to X$ and $r\colon Y \to X$ constructed in proving
theorems 1.7 and 1.8 are in general not unique. Even the method of constructing
them contains a definite extent of arbitrariness.

Definition 1.7. A mapping $f^{-1}\colon Y \to X$ is called a bilateral inverse mapping
or simply an inverse mapping for the mapping $f\colon X \to Y$ if

$$f^{-1} \circ f = \operatorname{id}_X, \qquad f \circ f^{-1} = \operatorname{id}_Y. \qquad (1.3)$$

Theorem 1.9. A mapping $f\colon X \to Y$ possesses both left and right inverse
mappings $l$ and $r$ if and only if it is bijective. In this case the mappings $l$ and $r$ are
uniquely determined. They coincide with each other, thus determining the unique
bilateral inverse mapping $l = r = f^{-1}$.

Proof. The first proposition of theorem 1.9 follows from theorems 1.7,
1.8, and 1.1. Let's prove the remaining propositions of theorem 1.9. The
coincidence $l = r$ is derived from the following chain of equalities:

$$l = l \circ \operatorname{id}_Y = l \circ (f \circ r) = (l \circ f) \circ r = \operatorname{id}_X \circ r = r.$$

The uniqueness of the left inverse mapping also follows from the same chain of
equalities. Indeed, if we assume that there is another left inverse mapping $l'$, then
from $l = r$ and $l' = r$ it follows that $l = l'$.

In a similar way, assuming the existence of another right inverse mapping $r'$, we
get $l = r$ and $l = r'$. Hence, $r = r'$. Coinciding with each other, the left and right
inverse mappings determine the unique bilateral inverse mapping $f^{-1} = l = r$
satisfying the equalities (1.3). □

§ 2. Linear vector spaces. 

Let $M$ be a set. A binary algebraic operation in $M$ is a rule that maps each
ordered pair of elements $x$, $y$ of the set $M$ to some uniquely determined element
$z \in M$. This rule can be denoted as a function $z = f(x, y)$. This notation is called
a prefix notation for an algebraic operation: the operation sign $f$ in it precedes
the elements $x$ and $y$ to which it is applied. There is another notation, the infix notation
for algebraic operations, where the operation sign is placed between the elements
$x$ and $y$. Examples are the binary operations of addition and multiplication of
numbers: $z = x + y$, $z = x \cdot y$. Sometimes special brackets play the role of the
operation sign, while the operands are separated by a comma. The vector product of
three-dimensional vectors yields an example of such notation: $z = [x, y]$.

Let $\mathbb{K}$ be a numeric field. By a numeric field in this book we
understand one of three such fields: the field of rational numbers $\mathbb{K} = \mathbb{Q}$, the field
of real numbers $\mathbb{K} = \mathbb{R}$, or the field of complex numbers $\mathbb{K} = \mathbb{C}$. The operation of
multiplication by numbers from the field $\mathbb{K}$ in a set $M$ is a rule that maps each pair
$(\alpha, x)$ consisting of a number $\alpha \in \mathbb{K}$ and of an element $x \in M$ to some element
$y \in M$. The operation of multiplication by numbers is written in infix form:
$y = \alpha \cdot x$. The multiplication sign in this notation is often omitted: $y = \alpha x$.

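Definition 2.1. A set $V$ equipped with a binary operation of addition and with
the operation of multiplication by numbers from the field $\mathbb{K}$ is called a linear
vector space over the field $\mathbb{K}$ if the following conditions are fulfilled:

(1) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ for all $\mathbf{u}, \mathbf{v} \in V$;

(2) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$;

(3) there is an element $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for all $\mathbf{v} \in V$; any such
element is called a zero element;

(4) for any $\mathbf{v} \in V$ and for any zero element $\mathbf{0}$ there is an element $\mathbf{v}' \in V$ such
that $\mathbf{v} + \mathbf{v}' = \mathbf{0}$; it is called an opposite element for $\mathbf{v}$;

(5) $\alpha \cdot (\mathbf{u} + \mathbf{v}) = \alpha \cdot \mathbf{u} + \alpha \cdot \mathbf{v}$ for any number $\alpha \in \mathbb{K}$ and for any two elements
$\mathbf{u}, \mathbf{v} \in V$;

(6) $(\alpha + \beta) \cdot \mathbf{v} = \alpha \cdot \mathbf{v} + \beta \cdot \mathbf{v}$ for any two numbers $\alpha, \beta \in \mathbb{K}$ and for any element
$\mathbf{v} \in V$;

(7) $\alpha \cdot (\beta \cdot \mathbf{v}) = (\alpha\beta) \cdot \mathbf{v}$ for any two numbers $\alpha, \beta \in \mathbb{K}$ and for any element
$\mathbf{v} \in V$;

(8) $1 \cdot \mathbf{v} = \mathbf{v}$ for the number $1 \in \mathbb{K}$ and for any element $\mathbf{v} \in V$.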

The elements of a linear vector space are usually called vectors, while
the conditions (1)-(8) are called the axioms of a linear vector space. We shall
distinguish rational, real, and complex linear vector spaces depending on which
numeric field $\mathbb{K} = \mathbb{Q}$, $\mathbb{K} = \mathbb{R}$, or $\mathbb{K} = \mathbb{C}$ they are defined over. Most of the results
in this book are valid for any numeric field $\mathbb{K}$. Formulating such results, we shall
not specify the type of linear vector space.

Axioms (1) and (2) are the axiom of commutativity¹ and the axiom of associativity
respectively. Axioms (5) and (6) express the distributivity.

¹ The system of axioms (1)-(8) is excessive: the axiom (1) can be derived from the other axioms.
I am grateful to A. B. Muftakhov who communicated this curious fact to me.

Theorem 2.1. Algebraic operations in an arbitrary linear vector space $V$ possess
the following properties:

(9) the zero vector $\mathbf{0} \in V$ is unique;

(10) for any vector $\mathbf{v} \in V$ the vector $\mathbf{v}'$ opposite to $\mathbf{v}$ is unique;

(11) the product of the number $0 \in \mathbb{K}$ and any vector $\mathbf{v} \in V$ is equal to the zero
vector: $0 \cdot \mathbf{v} = \mathbf{0}$;

(12) the product of an arbitrary number $\alpha \in \mathbb{K}$ and the zero vector is equal to the zero
vector: $\alpha \cdot \mathbf{0} = \mathbf{0}$;

(13) the product of the number $-1 \in \mathbb{K}$ and the vector $\mathbf{v} \in V$ is equal to the
opposite vector: $(-1) \cdot \mathbf{v} = \mathbf{v}'$.

Proof. The properties (9)-(13) are immediate consequences of the axioms
(1)-(8). Therefore, they are enumerated so that their numbers continue the
numbering of the axioms of a linear vector space.

Suppose that in a linear vector space there are two elements $\mathbf{0}$ and $\mathbf{0}'$ with the
properties of zero vectors. Then for any vector $\mathbf{v} \in V$ due to the axiom (3) we
have $\mathbf{v} = \mathbf{v} + \mathbf{0}$ and $\mathbf{v} + \mathbf{0}' = \mathbf{v}$. Let's substitute $\mathbf{v} = \mathbf{0}'$ into the first equality and
substitute $\mathbf{v} = \mathbf{0}$ into the second one. Taking into account the axiom (1), we get

$$\mathbf{0}' = \mathbf{0}' + \mathbf{0} = \mathbf{0} + \mathbf{0}' = \mathbf{0}.$$

This means that the vectors $\mathbf{0}$ and $\mathbf{0}'$ do actually coincide. The uniqueness of the zero
vector is proved.

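Let $\mathbf{v}$ be some arbitrary vector in a vector space $V$. Suppose that there are two
vectors $\mathbf{v}'$ and $\mathbf{v}''$ opposite to $\mathbf{v}$. Then

$$\mathbf{v} + \mathbf{v}' = \mathbf{0}, \qquad \mathbf{v} + \mathbf{v}'' = \mathbf{0}.$$

The following calculations prove the uniqueness of the opposite vector:

$$\mathbf{v}'' = \mathbf{v}'' + \mathbf{0} = \mathbf{v}'' + (\mathbf{v} + \mathbf{v}') = (\mathbf{v}'' + \mathbf{v}) + \mathbf{v}' = (\mathbf{v} + \mathbf{v}'') + \mathbf{v}' = \mathbf{0} + \mathbf{v}' = \mathbf{v}' + \mathbf{0} = \mathbf{v}'.$$

In deriving $\mathbf{v}'' = \mathbf{v}'$ above we used the axioms (3) and (4), the associativity axiom (2), and
we used the commutativity axiom (1) twice.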

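Again, let $\mathbf{v}$ be some arbitrary vector in a vector space $V$. Let's take $\mathbf{x} = 0 \cdot \mathbf{v}$,
then let's add $\mathbf{x}$ to $\mathbf{x}$ and apply the distributivity axiom (6). As a result we get

$$\mathbf{x} + \mathbf{x} = 0 \cdot \mathbf{v} + 0 \cdot \mathbf{v} = (0 + 0) \cdot \mathbf{v} = 0 \cdot \mathbf{v} = \mathbf{x}.$$

Thus we have proved that $\mathbf{x} + \mathbf{x} = \mathbf{x}$. Then we easily derive that $\mathbf{x} = \mathbf{0}$:

$$\mathbf{x} = \mathbf{x} + \mathbf{0} = \mathbf{x} + (\mathbf{x} + \mathbf{x}') = (\mathbf{x} + \mathbf{x}) + \mathbf{x}' = \mathbf{x} + \mathbf{x}' = \mathbf{0}.$$

Here we used the associativity axiom (2). The property (11) is proved.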

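Let $\alpha$ be some arbitrary number of a numeric field $\mathbb{K}$. Let's take $\mathbf{x} = \alpha \cdot \mathbf{0}$,
where $\mathbf{0}$ is the zero vector of a vector space $V$. Then

$$\mathbf{x} + \mathbf{x} = \alpha \cdot \mathbf{0} + \alpha \cdot \mathbf{0} = \alpha \cdot (\mathbf{0} + \mathbf{0}) = \alpha \cdot \mathbf{0} = \mathbf{x}.$$

Here we used the axiom (5) and the property of the zero vector from the axiom (3).
From the equality $\mathbf{x} + \mathbf{x} = \mathbf{x}$ it follows that $\mathbf{x} = \mathbf{0}$ (see above). Thus, the
property (12) is proved.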

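Let $\mathbf{v}$ be some arbitrary vector of a vector space $V$. Let $\mathbf{x} = (-1) \cdot \mathbf{v}$. Applying
axioms (8) and (6) and the property (11), for the vector $\mathbf{x}$ we derive

$$\mathbf{v} + \mathbf{x} = 1 \cdot \mathbf{v} + \mathbf{x} = 1 \cdot \mathbf{v} + (-1) \cdot \mathbf{v} = (1 + (-1)) \cdot \mathbf{v} = 0 \cdot \mathbf{v} = \mathbf{0}.$$

The equality $\mathbf{v} + \mathbf{x} = \mathbf{0}$ just derived means that $\mathbf{x}$ is an opposite vector for the
vector $\mathbf{v}$ in the sense of the axiom (4). Due to the uniqueness property (10) of the
opposite vector we conclude that $\mathbf{x} = \mathbf{v}'$. Therefore, $(-1) \cdot \mathbf{v} = \mathbf{v}'$. The theorem is
completely proved. □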

Due to the commutativity and associativity axioms we need not worry about
setting brackets and about the order of the summands when writing sums of
vectors. The property (13) and the axioms (7) and (8) yield

$$(-1) \cdot \mathbf{v}' = (-1) \cdot ((-1) \cdot \mathbf{v}) = ((-1)(-1)) \cdot \mathbf{v} = 1 \cdot \mathbf{v} = \mathbf{v}.$$

This equality shows that the notation $\mathbf{v}' = -\mathbf{v}$ for an opposite vector is quite
natural. In addition, we can write

$$-\alpha \cdot \mathbf{v} = -(\alpha \cdot \mathbf{v}) = (-1) \cdot (\alpha \cdot \mathbf{v}) = (-\alpha) \cdot \mathbf{v}.$$



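The operation of subtraction is an opposite operation for the vector addition. It
is defined as the addition of the opposite vector: $\mathbf{x} - \mathbf{y} = \mathbf{x} + (-\mathbf{y})$. The
following properties of the operation of vector subtraction

$$\begin{aligned}
(\mathbf{a} + \mathbf{b}) - \mathbf{c} &= \mathbf{a} + (\mathbf{b} - \mathbf{c}), \\
(\mathbf{a} - \mathbf{b}) + \mathbf{c} &= \mathbf{a} - (\mathbf{b} - \mathbf{c}), \\
(\mathbf{a} - \mathbf{b}) - \mathbf{c} &= \mathbf{a} - (\mathbf{b} + \mathbf{c}), \\
\alpha \cdot (\mathbf{x} - \mathbf{y}) &= \alpha \cdot \mathbf{x} - \alpha \cdot \mathbf{y}
\end{aligned}$$

make the calculations with vectors very simple and quite similar to the calculations
with numbers. The proof of the above properties is left to the reader.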

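Let's consider some examples of linear vector spaces. The real arithmetic vector
space $\mathbb{R}^n$ is defined as the set of ordered $n$-tuples of real numbers $x^1, \ldots, x^n$.
Such $n$-tuples are represented in the form of column vectors. Algebraic operations
with column vectors are defined as operations with their components:

$$\begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix} + \begin{pmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{pmatrix} = \begin{pmatrix} x^1 + y^1 \\ x^2 + y^2 \\ \vdots \\ x^n + y^n \end{pmatrix}, \qquad \alpha \cdot \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix} = \begin{pmatrix} \alpha \cdot x^1 \\ \alpha \cdot x^2 \\ \vdots \\ \alpha \cdot x^n \end{pmatrix}. \qquad (2.1)$$

We leave it to the reader to check that the set $\mathbb{R}^n$ of all ordered $n$-tuples
with the algebraic operations (2.1) is a linear vector space over the field $\mathbb{R}$ of real
numbers. The rational arithmetic vector space $\mathbb{Q}^n$ over the field $\mathbb{Q}$ of rational numbers
and the complex arithmetic vector space $\mathbb{C}^n$ over the field $\mathbb{C}$ of complex numbers are
defined in a similar way.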
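
The componentwise operations (2.1) are easy to program. The sketch below works in the rational arithmetic vector space of dimension 3, so that all computations with Python's `Fraction` type are exact, and spot-checks the axioms (1)-(8) on a few sample vectors. The function names `add` and `scale` and the sample data are assumptions of this illustration; checking several examples does not replace the proof left to the reader.

```python
from fractions import Fraction

def add(x, y):
    """Componentwise addition of two n-tuples, as in (2.1)."""
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Componentwise multiplication of an n-tuple by a number, as in (2.1)."""
    return tuple(alpha * a for a in x)

u = (Fraction(1), Fraction(1, 2), Fraction(-3))
v = (Fraction(2, 3), Fraction(0), Fraction(5))
w = (Fraction(-1), Fraction(4), Fraction(1, 7))
zero = (Fraction(0),) * 3
a, b = Fraction(3, 4), Fraction(-2)

checks = [
    add(u, v) == add(v, u),                                  # axiom (1)
    add(add(u, v), w) == add(u, add(v, w)),                  # axiom (2)
    add(v, zero) == v,                                       # axiom (3)
    add(v, scale(Fraction(-1), v)) == zero,                  # axiom (4)
    scale(a, add(u, v)) == add(scale(a, u), scale(a, v)),    # axiom (5)
    scale(a + b, v) == add(scale(a, v), scale(b, v)),        # axiom (6)
    scale(a, scale(b, v)) == scale(a * b, v),                # axiom (7)
    scale(Fraction(1), v) == v,                              # axiom (8)
]
```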

Let's consider the set of $m$-times continuously differentiable real-valued functions
on the segment $[-1, 1]$ of the real axis. This set is usually denoted as $C^m([-1, 1])$.
The operations of addition and multiplication by numbers in $C^m([-1, 1])$ are defined
as pointwise operations. This means that the value of the function $f + g$ at
a point $a$ is the sum of the values of $f$ and $g$ at that point. In a similar way, the
value of the function $\alpha \cdot f$ at the point $a$ is the product of the two numbers $\alpha$ and $f(a)$.
The reader can easily verify that the set of functions $C^m([-1, 1])$ with the pointwise algebraic
operations of addition and multiplication by numbers is a linear vector space over
the field of real numbers $\mathbb{R}$.
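
Pointwise operations have a direct counterpart in programming: the sum of two functions is a new function that adds their values at each point. Here is a minimal sketch; the names `add` and `scale` and the two sample functions are assumptions of this example, and differentiability is not checked by the code.

```python
def add(f, g):
    """Pointwise sum: the value of f + g at a point a is f(a) + g(a)."""
    return lambda a: f(a) + g(a)

def scale(alpha, f):
    """Pointwise product with a number: the value at a point a is alpha * f(a)."""
    return lambda a: alpha * f(a)

f = lambda x: x * x        # sample function f(x) = x^2
g = lambda x: 3 * x + 1    # sample function g(x) = 3x + 1

h = add(f, scale(2, g))    # the function h = f + 2 g
```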

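Definition 2.2. A non-empty subset $U \subset V$ in a linear vector space $V$ over a
numeric field $\mathbb{K}$ is called a subspace of the space $V$ if:

(1) from $\mathbf{u}_1, \mathbf{u}_2 \in U$ it follows that $\mathbf{u}_1 + \mathbf{u}_2 \in U$;

(2) from $\mathbf{u} \in U$ it follows that $\alpha \cdot \mathbf{u} \in U$ for any number $\alpha \in \mathbb{K}$.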

Let $U$ be a subspace of a linear vector space $V$. Let's regard $U$ as an isolated
set. Due to the above conditions (1) and (2) this set is closed with respect to the
operations of addition and multiplication by numbers. It is easy to show that the
zero vector is an element of $U$ and that for any $\mathbf{u} \in U$ the opposite vector $\mathbf{u}'$ is also
an element of $U$. These facts follow from $\mathbf{0} = 0 \cdot \mathbf{u}$ and $\mathbf{u}' = (-1) \cdot \mathbf{u}$. Relying
upon these facts one can easily prove that any subspace $U \subset V$, when considered
as an isolated set, is a linear vector space over the field $\mathbb{K}$. Indeed, we have
already shown that axioms (3) and (4) are valid for it. Verifying axioms (1),
(2) and the remaining axioms (5)-(8) consists in checking equalities written in terms
of the operations of addition and multiplication by numbers. Being fulfilled for
arbitrary vectors of $V$, these equalities are obviously fulfilled for vectors of the subset
$U \subset V$. Since $U$ is closed with respect to the algebraic operations,
all calculations in these equalities are performed within the subset $U$.

As examples of the concept of subspace we can mention the following
subspaces in the functional space $C^m([-1, 1])$:

- the subspace of even functions ($f(-x) = f(x)$);

- the subspace of odd functions ($f(-x) = -f(x)$);

- the subspace of polynomials ($f(x) = a_n x^n + \ldots + a_1 x + a_0$).

§ 3. Linear dependence and linear independence. 

Let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be a system of vectors from some linear vector space $V$.
Applying the operations of multiplication by numbers and addition to them we
can produce the following expressions with these vectors:

$$\mathbf{v} = \alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_n \cdot \mathbf{v}_n. \qquad (3.1)$$

An expression of the form (3.1) is called a linear combination of the vectors
$\mathbf{v}_1, \ldots, \mathbf{v}_n$. The numbers $\alpha_1, \ldots, \alpha_n$ are taken from the field $\mathbb{K}$; they are called
the coefficients of the linear combination (3.1), while the vector $\mathbf{v}$ is called the value
of this linear combination. A linear combination is said to be zero or equal to zero if
its value is zero.

A linear combination is called trivial if all its coefficients are equal to zero:
$\alpha_1 = \ldots = \alpha_n = 0$. Otherwise it is called nontrivial.

Definition 3.1. A system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in a linear vector space $V$ is
called linearly dependent if there exists some nontrivial linear combination of these
vectors equal to zero.

Definition 3.2. A system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in a linear vector space $V$ is
called linearly independent if any linear combination of these vectors that is equal
to zero is necessarily trivial.

The concept of linear independence is obtained by direct logical negation of the
concept of linear dependence. The reader can give several equivalent statements
defining this concept. Here we give only one of such statements which, in our
opinion, is most convenient in what follows.

Let's introduce one more concept related to linear combinations. We say that
a vector $\mathbf{v}$ is linearly expressed through the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ if $\mathbf{v}$ is the value of
some linear combination composed of $\mathbf{v}_1, \ldots, \mathbf{v}_n$.
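
The concepts just introduced can be illustrated by a short computation with vectors given by their components. The sketch below evaluates a linear combination and checks whether a given list of coefficients is a nontrivial linear combination equal to zero, i.e. whether it demonstrates linear dependence in the sense of definition 3.1. The function names and the three sample vectors are assumptions of this example; the code verifies a proposed combination, it does not search for one.

```python
from fractions import Fraction

def linear_combination(coeffs, vectors):
    """Value of the linear combination with the given coefficients."""
    value = [Fraction(0)] * len(vectors[0])
    for alpha, v in zip(coeffs, vectors):
        value = [s + alpha * c for s, c in zip(value, v)]
    return tuple(value)

def is_trivial(coeffs):
    """A linear combination is trivial if all its coefficients are zero."""
    return all(alpha == 0 for alpha in coeffs)

def shows_dependence(coeffs, vectors):
    """True if coeffs give a nontrivial linear combination equal to zero."""
    value = linear_combination(coeffs, vectors)
    return (not is_trivial(coeffs)) and all(c == 0 for c in value)

# sample vectors; the third one is linearly expressed through the first two
v1, v2, v3 = (1, 0, 2), (0, 1, 1), (2, 3, 7)
system = [v1, v2, v3]
```

For this system the coefficients 2, 3, -1 give a nontrivial linear combination equal to zero, which corresponds to the expression of the third vector as twice the first plus three times the second.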
Theorem 3.1. The relation of linear dependence of vectors in a linear vector
space has the following basic properties:

(1) any system of vectors comprising the zero vector is linearly dependent;

(2) any system of vectors comprising a linearly dependent subsystem is linearly
dependent as a whole;

(3) if a system of vectors is linearly dependent, then at least one of these vectors
is linearly expressed through the others;

(4) if a system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is linearly independent and if adding the
next vector $\mathbf{v}_{n+1}$ to it we make it linearly dependent, then the vector $\mathbf{v}_{n+1}$
is linearly expressed through the previous vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$;

(5) if a vector $\mathbf{x}$ is linearly expressed through the vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$ and if each
one of the vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$ is linearly expressed through $\mathbf{z}_1, \ldots, \mathbf{z}_n$, then
$\mathbf{x}$ is linearly expressed through $\mathbf{z}_1, \ldots, \mathbf{z}_n$.

Proof. Suppose that a system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ comprises the zero vector.
For the sake of certainty we can assume that $\mathbf{v}_k = \mathbf{0}$. Let's compose the following
linear combination of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$:

$$0 \cdot \mathbf{v}_1 + \ldots + 0 \cdot \mathbf{v}_{k-1} + 1 \cdot \mathbf{v}_k + 0 \cdot \mathbf{v}_{k+1} + \ldots + 0 \cdot \mathbf{v}_n = \mathbf{0}.$$

This linear combination is nontrivial since the coefficient of the vector $\mathbf{v}_k$ is nonzero,
and its value is equal to zero. Hence, the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly
dependent. The property (1) is proved.

Suppose that a system of vectors
$\mathbf{v}_1, \ldots, \mathbf{v}_n$ comprises a linearly dependent subsystem. Since linear dependence is
not sensitive to the order in which the vectors of a system are enumerated, we can
assume that the first $k$ vectors form a linearly dependent subsystem in it. Then there
exists some nontrivial linear combination of these $k$ vectors equal to zero:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_k \cdot \mathbf{v}_k = \mathbf{0}.$$

Let's expand this linear combination by adding the other vectors with zero coefficients:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_k \cdot \mathbf{v}_k + 0 \cdot \mathbf{v}_{k+1} + \ldots + 0 \cdot \mathbf{v}_n = \mathbf{0}.$$

It is obvious that the resulting linear combination is nontrivial and its value is
equal to zero. Hence, the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly dependent. The property
(2) is proved.

Let's assume that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly dependent. Then there
exists a nontrivial linear combination of them equal to zero:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_n \cdot \mathbf{v}_n = \mathbf{0}. \qquad (3.2)$$

Non-triviality of the linear combination (3.2) means that at least one of its
coefficients is nonzero. Suppose that $\alpha_k \neq 0$. Let's write (3.2) in more detail:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_k \cdot \mathbf{v}_k + \ldots + \alpha_n \cdot \mathbf{v}_n = \mathbf{0}.$$

Let's move the term $\alpha_k \cdot \mathbf{v}_k$ to the right hand side of the above equality, and then
let's divide the equality by $-\alpha_k$:

$$\mathbf{v}_k = -\frac{\alpha_1}{\alpha_k} \cdot \mathbf{v}_1 - \ldots - \frac{\alpha_{k-1}}{\alpha_k} \cdot \mathbf{v}_{k-1} - \frac{\alpha_{k+1}}{\alpha_k} \cdot \mathbf{v}_{k+1} - \ldots - \frac{\alpha_n}{\alpha_k} \cdot \mathbf{v}_n.$$

Now we see that the vector $\mathbf{v}_k$ is linearly expressed through the other vectors of the
system. The property (3) is proved.

Let's consider a linearly independent system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ such that
adding the next vector $\mathbf{v}_{n+1}$ to it we make it linearly dependent. Then there is
some nontrivial linear combination of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_{n+1}$ equal to zero:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_n \cdot \mathbf{v}_n + \alpha_{n+1} \cdot \mathbf{v}_{n+1} = \mathbf{0}.$$

Let's prove that $\alpha_{n+1} \neq 0$. If, conversely, we assume that $\alpha_{n+1} = 0$, we would get
a nontrivial linear combination of $n$ vectors equal to zero:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_n \cdot \mathbf{v}_n = \mathbf{0}.$$

This contradicts the linear independence of the first $n$ vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$.
Hence, $\alpha_{n+1} \neq 0$, and we can apply the trick already used above:

$$\mathbf{v}_{n+1} = -\frac{\alpha_1}{\alpha_{n+1}} \cdot \mathbf{v}_1 - \ldots - \frac{\alpha_n}{\alpha_{n+1}} \cdot \mathbf{v}_n.$$

This expression for the vector $\mathbf{v}_{n+1}$ completes the proof of the property (4).

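Suppose that the vector $\mathbf{x}$ is linearly expressed through $\mathbf{y}_1, \ldots, \mathbf{y}_m$, and each
one of the vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$ is linearly expressed through $\mathbf{z}_1, \ldots, \mathbf{z}_n$. This fact
is expressed by the following formulas:

$$\mathbf{x} = \sum_{i=1}^{m} \alpha_i \cdot \mathbf{y}_i, \qquad \mathbf{y}_i = \sum_{j=1}^{n} \beta_{ij} \cdot \mathbf{z}_j.$$

Substituting the second formula into the first one, for the vector $\mathbf{x}$ we get

$$\mathbf{x} = \sum_{i=1}^{m} \alpha_i \cdot \left( \sum_{j=1}^{n} \beta_{ij} \cdot \mathbf{z}_j \right) = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \alpha_i\, \beta_{ij} \right) \cdot \mathbf{z}_j.$$

The above expression for the vector $\mathbf{x}$ shows that it is linearly expressed through the
vectors $\mathbf{z}_1, \ldots, \mathbf{z}_n$. The property (5) is proved. This completes the proof of
theorem 3.1 as a whole. □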
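
The substitution used in proving the property (5) can be checked numerically: the coefficient of each of the last vectors in the resulting expression is obtained by summing the products of the corresponding coefficients of the two expressions. In the sketch below the names `alpha`, `beta`, `gamma`, the helper `combo`, and all numeric data are assumptions made for this illustration.

```python
def combo(coeffs, vectors):
    """Value of a linear combination of vectors given by their components."""
    n = len(vectors[0])
    return tuple(sum(c * v[t] for c, v in zip(coeffs, vectors)) for t in range(n))

z = [(1, 0, 1), (0, 1, 1)]            # sample vectors z_1, z_2
beta = [[1, 2], [3, -1], [0, 4]]      # y_i is the combination of z with row beta[i]
alpha = [2, -1, 5]                    # x is the combination of y with coefficients alpha

y = [combo(row, z) for row in beta]   # the vectors y_1, y_2, y_3
x = combo(alpha, y)                   # x expressed through y_1, y_2, y_3

# coefficients of x with respect to z_1, z_2 after the substitution:
# gamma[j] is the sum over i of alpha[i] * beta[i][j]
gamma = [sum(alpha[i] * beta[i][j] for i in range(len(alpha)))
         for j in range(len(z))]
x_through_z = combo(gamma, z)
```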

Note the following important consequence of the property (2) in
theorem 3.1.

Corollary. Any subsystem in a linearly independent system of vectors is linearly
independent.
The next property of linear dependence of vectors is known as the Steinitz theorem.
It describes a quantitative feature of this concept.

Theorem 3.2 (Steinitz). If the vectors $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are linearly independent and
if each of them is linearly expressed through the vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$, then $m \geq n$.

Proof. We shall prove this theorem by induction on the number of vectors in
the system $\mathbf{x}_1, \ldots, \mathbf{x}_n$. Let's begin with the case $n = 1$. Linear independence of a
system with a single vector $\mathbf{x}_1$ means that $\mathbf{x}_1 \neq \mathbf{0}$. In order to express the nonzero
vector $\mathbf{x}_1$ through the vectors of a system $\mathbf{y}_1, \ldots, \mathbf{y}_m$, this system should contain
at least one vector. Hence, $m \geq 1$. The base step of induction is proved.

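Suppose that the theorem holds for the case $n = k$. Under this assumption let's prove that it is valid for $n = k + 1$. If $n = k + 1$, we have a system of linearly independent vectors $\mathbf{x}_1, \ldots, \mathbf{x}_{k+1}$, each vector being expressed through the vectors of another system $\mathbf{y}_1, \ldots, \mathbf{y}_m$. We express this fact by the formulas

$$\begin{aligned} \mathbf{x}_1 &= \alpha_{11} \cdot \mathbf{y}_1 + \ldots + \alpha_{1m} \cdot \mathbf{y}_m, \\ & \ldots \\ \mathbf{x}_k &= \alpha_{k1} \cdot \mathbf{y}_1 + \ldots + \alpha_{km} \cdot \mathbf{y}_m. \end{aligned} \tag{3.3}$$

We shall write the analogous formula expressing $\mathbf{x}_{k+1}$ through $\mathbf{y}_1, \ldots, \mathbf{y}_m$ in a slightly different way:

$$\mathbf{x}_{k+1} = \beta_1 \cdot \mathbf{y}_1 + \ldots + \beta_m \cdot \mathbf{y}_m.$$

Due to the linear independence of the vectors $\mathbf{x}_1, \ldots, \mathbf{x}_{k+1}$, the last vector $\mathbf{x}_{k+1}$ of this system is nonzero (as well as the other ones). Therefore at least one of the numbers $\beta_1, \ldots, \beta_m$ is nonzero. Upon renumbering the vectors $\mathbf{y}_1, \ldots, \mathbf{y}_m$, if necessary, we can assume that $\beta_m \neq 0$. Then

$$\mathbf{y}_m = \frac{1}{\beta_m} \cdot \mathbf{x}_{k+1} - \frac{\beta_1}{\beta_m} \cdot \mathbf{y}_1 - \ldots - \frac{\beta_{m-1}}{\beta_m} \cdot \mathbf{y}_{m-1}. \tag{3.4}$$

Let's substitute (3.4) into the relationships (3.3) and collect similar terms in them. As a result the relationships (3.3) are written as

$$\mathbf{x}_i - \frac{\alpha_{im}}{\beta_m} \cdot \mathbf{x}_{k+1} = \sum_{j=1}^{m-1} \left( \alpha_{ij} - \beta_j \, \frac{\alpha_{im}}{\beta_m} \right) \cdot \mathbf{y}_j, \tag{3.5}$$

where $i = 1, \ldots, k$. In order to simplify (3.5) we introduce the following notations:

$$\mathbf{x}^*_i = \mathbf{x}_i - \frac{\alpha_{im}}{\beta_m} \cdot \mathbf{x}_{k+1}, \qquad \alpha^*_{ij} = \alpha_{ij} - \beta_j \, \frac{\alpha_{im}}{\beta_m}. \tag{3.6}$$

In these notations the formulas (3.5) are written as

$$\begin{aligned} \mathbf{x}^*_1 &= \alpha^*_{11} \cdot \mathbf{y}_1 + \ldots + \alpha^*_{1\,m-1} \cdot \mathbf{y}_{m-1}, \\ & \ldots \\ \mathbf{x}^*_k &= \alpha^*_{k1} \cdot \mathbf{y}_1 + \ldots + \alpha^*_{k\,m-1} \cdot \mathbf{y}_{m-1}. \end{aligned} \tag{3.7}$$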
According to the above formulas, the $k$ vectors $\mathbf{x}^*_1, \ldots, \mathbf{x}^*_k$ are linearly expressed through $\mathbf{y}_1, \ldots, \mathbf{y}_{m-1}$. In order to apply the inductive hypothesis we need to show that the vectors $\mathbf{x}^*_1, \ldots, \mathbf{x}^*_k$ are linearly independent. Let's consider a linear combination of these vectors equal to zero:

$$\gamma_1 \cdot \mathbf{x}^*_1 + \ldots + \gamma_k \cdot \mathbf{x}^*_k = \mathbf{0}. \tag{3.8}$$

Substituting (3.6) for $\mathbf{x}^*_i$ in (3.8), upon collecting similar terms, we get

$$\gamma_1 \cdot \mathbf{x}_1 + \ldots + \gamma_k \cdot \mathbf{x}_k - \left( \sum_{i=1}^{k} \gamma_i \, \frac{\alpha_{im}}{\beta_m} \right) \cdot \mathbf{x}_{k+1} = \mathbf{0}.$$

Due to the linear independence of the initial system of vectors $\mathbf{x}_1, \ldots, \mathbf{x}_{k+1}$ we derive $\gamma_1 = \ldots = \gamma_k = 0$. Hence the linear combination (3.8) is trivial, which proves the linear independence of the vectors $\mathbf{x}^*_1, \ldots, \mathbf{x}^*_k$. Now, applying the inductive hypothesis to the relationships (3.7), we get $m - 1 \geq k$. The required inequality $m \geq k + 1$, which proves the theorem for the case $n = k + 1$, is an immediate consequence of $m - 1 \geq k$. So the inductive step is completed and the theorem is proved. □
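The statement of the Steinitz theorem can be checked numerically on a small example. The following is only an illustrative sketch, not part of the original text: the helper `rank` and the sample vectors `ys`, `coeffs`, `xs` are hypothetical names introduced here, and vectors are assumed to be rows of rational numbers. Three vectors are built as linear combinations of $m = 2$ vectors, and the number of linearly independent vectors among them is confirmed not to exceed $m$.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical data: m = 2 vectors y_1, y_2 in R^3.
ys = [[1, 0, 1], [0, 1, 1]]
# Coefficients alpha_ij of three combinations x_i = alpha_i1 * y_1 + alpha_i2 * y_2.
coeffs = [[1, 2], [3, -1], [0, 5]]
xs = [[sum(a * y[t] for a, y in zip(row, ys)) for t in range(3)] for row in coeffs]

# The maximal number of linearly independent vectors among the x_i.
steinitz_rank = rank(xs)
print(steinitz_rank, "<=", len(ys))
```

Here the three vectors in `xs` cannot be linearly independent, since they are all expressed through only two vectors.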

§ 4. Spanning systems and bases. 

Let $S \subset V$ be some non-empty subset in a linear vector space $V$. The set $S$ can consist of either a finite number of vectors or an infinite number of vectors. We denote by $\langle S \rangle$ the set of all vectors, each of which is linearly expressed through some finite number of vectors taken from $S$:

$$\langle S \rangle = \{ \mathbf{v} \in V : \exists\, n \ (\mathbf{v} = \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n, \text{ where } \mathbf{s}_i \in S) \}.$$

This set $\langle S \rangle$ is called the linear span of the subset $S \subset V$.
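For a finite set $S$ of vectors with rational coordinates, membership in the linear span can be tested by comparing ranks: a vector belongs to $\langle S \rangle$ exactly when adding it to $S$ does not increase the rank. The sketch below rests on that assumption; the names `rank` and `in_span` and the sample data are hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_span(v, s):
    """True if v is a linear combination of the vectors in the finite list s."""
    return rank(list(s) + [list(v)]) == rank(list(s))


# Hypothetical example in R^3.
S = [[1, 0, 1], [0, 1, 1]]
inside = in_span([2, 3, 5], S)   # equals 2 * s_1 + 3 * s_2
outside = in_span([0, 0, 1], S)  # not a combination of s_1, s_2
print(inside, outside)
```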

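Theorem 4.1. The linear span of any subset $S \subset V$ is a subspace in a linear vector space $V$.

Proof. In order to prove this theorem it is sufficient to check the two conditions from the definition 2.2 for $\langle S \rangle$. Suppose that $\mathbf{u}_1, \mathbf{u}_2 \in \langle S \rangle$. Then

$$\begin{aligned} \mathbf{u}_1 &= \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n, \\ \mathbf{u}_2 &= \beta_1 \cdot \mathbf{s}^*_1 + \ldots + \beta_m \cdot \mathbf{s}^*_m. \end{aligned}$$

Adding these two equalities, we see that the vector $\mathbf{u}_1 + \mathbf{u}_2$ is also expressed as a linear combination of some finite number of vectors taken from $S$. Therefore, we have $\mathbf{u}_1 + \mathbf{u}_2 \in \langle S \rangle$.

Now suppose that $\mathbf{u} \in \langle S \rangle$. Then $\mathbf{u} = \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n$. For the vector $\alpha \cdot \mathbf{u}$, from this equality we derive

$$\alpha \cdot \mathbf{u} = (\alpha \, \alpha_1) \cdot \mathbf{s}_1 + \ldots + (\alpha \, \alpha_n) \cdot \mathbf{s}_n.$$

Hence $\alpha \cdot \mathbf{u} \in \langle S \rangle$. Both conditions (1) and (2) from the definition 2.2 for $\langle S \rangle$ are fulfilled. Thus, the theorem is proved. □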
Theorem 4.2. The operation of passing to the linear span in a linear vector space $V$ possesses the following properties:

(1) if $S \subset U$ and if $U$ is a subspace in $V$, then $\langle S \rangle \subset U$;

(2) the linear span of a subset $S \subset V$ is the intersection of all subspaces comprising this subset $S$.

Proof. Let $\mathbf{u} \in \langle S \rangle$ and $S \subset U$, where $U$ is a subspace. Then for the vector $\mathbf{u}$ we have $\mathbf{u} = \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n$, where $\mathbf{s}_i \in S$. But $\mathbf{s}_i \in S$ and $S \subset U$ imply $\mathbf{s}_i \in U$. Since $U$ is a subspace, the value of any linear combination of its elements is again an element of $U$. Hence $\mathbf{u} \in U$. This proves the inclusion $\langle S \rangle \subset U$.

Let's denote by $W$ the intersection of all subspaces of $V$ comprising the subset $S$. Due to the property (1), which is already proved, the subset $\langle S \rangle$ is included into each of such subspaces. Therefore $\langle S \rangle \subset W$. On the other hand, $\langle S \rangle$ is a subspace of $V$ comprising the subset $S$ (see theorem 4.1). Hence $\langle S \rangle$ is among those subspaces whose intersection forms $W$. Then $W \subset \langle S \rangle$. From the two inclusions $\langle S \rangle \subset W$ and $W \subset \langle S \rangle$ it follows that $\langle S \rangle = W$. The theorem is proved. □

Let $\langle S \rangle = U$. Then we say that the subset $S \subset V$ spans the subspace $U$, i.e. $S$ generates $U$ by means of linear combinations. This terminology is supported by the following definition.

Definition 4.1. A subset $S \subset V$ is called a generating subset or a spanning system of vectors in a linear vector space $V$ if $\langle S \rangle = V$.

A linear vector space $V$ can have multiple spanning systems. Therefore the problem of choosing a minimal (in some sense) spanning system is reasonable.

Definition 4.2. A spanning system of vectors $S \subset V$ in a linear vector space $V$ is called a minimal spanning system if none of its smaller subsystems $S' \subsetneq S$ is a spanning system in $V$, i.e. if $\langle S' \rangle \neq V$ for all $S' \subsetneq S$.

Definition 4.3. A system of vectors $S \subset V$ is called linearly independent if any finite subsystem of vectors $\mathbf{s}_1, \ldots, \mathbf{s}_n$ taken from $S$ is linearly independent.

This definition extends the definition 3.2 to the case of infinite systems of vectors. As for spanning systems, the relation between the properties of minimality and linear independence for them is determined by the following theorem.

Theorem 4.3. A spanning system of vectors $S \subset V$ is minimal if and only if it is linearly independent.

Proof. If a spanning system of vectors $S \subset V$ is linearly dependent, then it contains some finite linearly dependent set of vectors $\mathbf{s}_1, \ldots, \mathbf{s}_n$. Due to the item (3) in the statement of theorem 3.1, one of these vectors $\mathbf{s}_k$ is linearly expressed through the others. Then the subsystem $S' = S \setminus \{\mathbf{s}_k\}$ obtained by omitting this vector $\mathbf{s}_k$ from $S$ is a spanning system in $V$. This fact contradicts the minimality of $S$ (see definition 4.2 above). Therefore any minimal spanning system of vectors in $V$ is linearly independent.

If a spanning system of vectors $S \subset V$ is not minimal, then there is some smaller spanning subsystem $S' \subsetneq S$, i.e. a subsystem $S'$ such that

$$\langle S' \rangle = \langle S \rangle = V. \tag{4.1}$$
In this case we can choose some vector $\mathbf{s}_0 \in S$ such that $\mathbf{s}_0 \notin S'$. Due to (4.1) this vector is an element of $\langle S' \rangle$. Hence $\mathbf{s}_0$ is linearly expressed through some finite number of vectors taken from the subsystem $S'$:

$$\mathbf{s}_0 = \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n. \tag{4.2}$$

One can easily transform (4.2) to the form of a linear combination equal to zero:

$$(-1) \cdot \mathbf{s}_0 + \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_n \cdot \mathbf{s}_n = \mathbf{0}. \tag{4.3}$$

This linear combination is obviously nontrivial. Thus, we have found that the vectors $\mathbf{s}_0, \ldots, \mathbf{s}_n$ form a finite linearly dependent subset of $S$. Hence $S$ is linearly dependent (see the item (2) in theorem 3.1 and the definition 4.3). We have shown that a spanning system which is not minimal is linearly dependent; this means that any linearly independent spanning system of vectors in $V$ is minimal. □

Definition 4.4. A linear vector space $V$ is called finite-dimensional if there is some finite spanning system of vectors $S = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ in it.

In an arbitrary linear vector space $V$ there is at least one spanning system, e.g. $S = V$. However, the problem of existence of minimal spanning systems in the general case is nontrivial. The solution of this problem is positive, but it is neither elementary nor constructive. This problem is solved with the use of the axiom of choice (see [1]). Finite-dimensional vector spaces are distinguished by the fact that the proof of existence of minimal spanning systems for them is elementary.

Theorem 4.4. In a finite-dimensional linear vector space $V$ there is at least one minimal spanning system of vectors. Any two such systems $\{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$ and $\{\mathbf{y}_1, \ldots, \mathbf{y}_n\}$ have the same number of elements $n$. This number $n$ is called the dimension of $V$; it is denoted as $n = \dim V$.

Proof. Let $S = \{\mathbf{x}_1, \ldots, \mathbf{x}_k\}$ be some finite spanning system of vectors in a finite-dimensional linear vector space $V$. If this system is not minimal, then it is linearly dependent. Hence one of its vectors is linearly expressed through the others. This vector can be omitted, and we get a smaller spanning system $S'$ consisting of $k - 1$ vectors. If $S'$ is again not minimal, then we can iterate the process, getting one vector less in each step. Ultimately, we shall get a minimal spanning system $S_{\min}$ in $V$ with a finite number of vectors $n$ in it:

$$S_{\min} = \{\mathbf{y}_1, \ldots, \mathbf{y}_n\}. \tag{4.4}$$

Usually, the minimal spanning system of vectors (4.4) is not unique. Suppose that $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$ is some other minimal spanning system in $V$. Both systems $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$ and $\{\mathbf{y}_1, \ldots, \mathbf{y}_n\}$ are linearly independent and

$$\begin{aligned} \mathbf{x}_i &\in \langle \mathbf{y}_1, \ldots, \mathbf{y}_n \rangle \ \text{ for } i = 1, \ldots, m, \\ \mathbf{y}_i &\in \langle \mathbf{x}_1, \ldots, \mathbf{x}_m \rangle \ \text{ for } i = 1, \ldots, n. \end{aligned} \tag{4.5}$$

Due to (4.5) we can apply the Steinitz theorem 3.2 to the systems of vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_m\}$ and $\{\mathbf{y}_1, \ldots, \mathbf{y}_n\}$. As a result we get two inequalities $n \geq m$ and $m \geq n$. Therefore, $m = n = \dim V$. The theorem is proved. □
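The reduction process used in the first part of this proof can be sketched in code. This is a hedged illustration only: vectors are assumed to be rows of rational numbers, and the names `rank` and `minimal_spanning` are hypothetical. A vector is omitted whenever removing it does not change the rank, i.e. whenever it is linearly expressed through the remaining ones.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def minimal_spanning(vectors):
    """Omit vectors one by one while the span (measured by the rank) is unchanged."""
    system = [list(v) for v in vectors]
    i = 0
    while i < len(system):
        rest = system[:i] + system[i + 1:]
        if rank(rest) == rank(system):
            system = rest  # vector i is expressed through the others
        else:
            i += 1         # vector i is needed, keep it
    return system


# Hypothetical spanning system of R^2 with redundant vectors.
S = [[1, 0], [0, 1], [1, 1], [2, 2]]
S_min = minimal_spanning(S)
print(S_min, len(S_min))
```

Which vectors survive depends on the order of removal, but by theorem 4.4 the number of remaining vectors is always the same.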
The dimension $\dim V$ is an integer invariant of a finite-dimensional linear vector space. If $\dim V = n$, then such a space is called an $n$-dimensional space. Returning to the examples of linear vector spaces considered in § 2, note that $\dim \mathbb{R}^n = n$, while the functional space $C^m([-1, 1])$ is not finite-dimensional at all.

Theorem 4.5. Let $V$ be a finite-dimensional linear vector space. Then the following propositions are valid:

(1) the number of vectors in any linearly independent system of vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $V$ is not greater than the dimension of $V$;

(2) any subspace $U$ of the space $V$ is finite-dimensional and $\dim U \leq \dim V$;

(3) for any subspace $U$ in $V$, if $\dim U = \dim V$, then $U = V$;

(4) any linearly independent system of $n$ vectors $\mathbf{x}_1, \ldots, \mathbf{x}_n$, where $n = \dim V$, is a spanning system in $V$.

Proof. Suppose that $\dim V = n$. Let's fix some minimal spanning system of vectors $\mathbf{y}_1, \ldots, \mathbf{y}_n$ in $V$. Then each vector of the linearly independent system of vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in proposition (1) is linearly expressed through $\mathbf{y}_1, \ldots, \mathbf{y}_n$. Applying the Steinitz theorem 3.2, we get the inequality $k \leq n$. The first proposition of the theorem is proved.

Let's consider all possible linearly independent systems $\mathbf{u}_1, \ldots, \mathbf{u}_k$ composed of vectors of a subspace $U$. Due to the proposition (1), which is already proved, the number of vectors in such systems is bounded. It is not greater than $n = \dim V$. Therefore we can assume that $\mathbf{u}_1, \ldots, \mathbf{u}_k$ is a linearly independent system with the maximal number of vectors: $k = k_{\max} \leq n = \dim V$. If $\mathbf{u}$ is an arbitrary vector of the subspace $U$ and if we add it to the system $\mathbf{u}_1, \ldots, \mathbf{u}_k$, we get a linearly dependent system; this is because $k = k_{\max}$. Now, applying the property (4) from theorem 3.1, we conclude that the vector $\mathbf{u}$ is linearly expressed through the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$. Hence the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ form a finite spanning system in $U$. It is minimal since it is linearly independent (see theorem 4.3). Finite dimensionality of $U$ is proved. The estimate for its dimension follows from the above inequality: $\dim U = k \leq n = \dim V$.

Let $U$ again be a subspace in $V$. Assume that $\dim U = \dim V = n$. Let's choose some minimal spanning system of vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$ in $U$. It is linearly independent. Adding an arbitrary vector $\mathbf{v} \in V$ to this system, we make it linearly dependent, since in $V$ there is no linearly independent system with $n + 1$ vectors (see proposition (1), which is already proved). Furthermore, applying the property (4) from theorem 3.1 to the system $\mathbf{u}_1, \ldots, \mathbf{u}_n, \mathbf{v}$, we find that

$$\mathbf{v} = \alpha_1 \cdot \mathbf{u}_1 + \ldots + \alpha_n \cdot \mathbf{u}_n.$$

This formula means that $\mathbf{v} \in U$, where $\mathbf{v}$ is an arbitrary vector of the space $V$. Therefore, $U = V$. The third proposition of the theorem is proved.

Let $\mathbf{x}_1, \ldots, \mathbf{x}_n$ be a linearly independent system of $n$ vectors in $V$, where $n$ is equal to the dimension of the space $V$. Denote by $U$ the linear span of this system of vectors: $U = \langle \mathbf{x}_1, \ldots, \mathbf{x}_n \rangle$. Since $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are linearly independent, they form a minimal spanning system in $U$. Therefore, $\dim U = n = \dim V$. Now, applying proposition (3) of the theorem, we get

$$\langle \mathbf{x}_1, \ldots, \mathbf{x}_n \rangle = U = V.$$

This equality proves the fourth proposition of theorem 4.5 and completes the proof of the theorem as a whole. □

Definition 4.5. A minimal spanning system $\mathbf{e}_1, \ldots, \mathbf{e}_n$ with some fixed order of vectors in it is called a basis of a finite-dimensional vector space $V$.

Theorem 4.6 (basis criterion). An ordered system of vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is a basis in a finite-dimensional vector space $V$ if and only if

(1) the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are linearly independent;

(2) an arbitrary vector of the space $V$ is linearly expressed through $\mathbf{e}_1, \ldots, \mathbf{e}_n$.

The proof is obvious. The second condition of the theorem means that the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ form a spanning system in $V$, while the first condition is equivalent to its minimality (see theorem 4.3).

In essence, theorem 4.6 simply reformulates the definition 4.5. We give it here in order to simplify the terminology. The terms «spanning system» and «minimal spanning system» are cumbersome and inconvenient for frequent use.

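Theorem 4.7. Let $\mathbf{e}_1, \ldots, \mathbf{e}_s$ be a basis in a subspace $U \subset V$ and let $\mathbf{v} \in V$ be some vector outside this subspace: $\mathbf{v} \notin U$. Then the system of vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{v}$ is a linearly independent system.

Proof. Indeed, if the system of vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{v}$ is linearly dependent, while $\mathbf{e}_1, \ldots, \mathbf{e}_s$ is a linearly independent system, then due to the property (4) from theorem 3.1 the vector $\mathbf{v}$ is linearly expressed through the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$, thus contradicting the condition $\mathbf{v} \notin U$. This contradiction proves theorem 4.7. □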

Theorem 4.8 (on completing the basis). Let $U$ be a subspace in a finite-dimensional linear vector space $V$. Then any basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ of $U$ can be completed up to a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{e}_{s+1}, \ldots, \mathbf{e}_n$ in $V$.

Proof. Let's denote $U = U_0$. If $U_0 = V$, then there is no need to complete the basis since $\mathbf{e}_1, \ldots, \mathbf{e}_s$ is a basis in $V$. Otherwise, if $U_0 \neq V$, then let's denote by $\mathbf{e}_{s+1}$ some arbitrary vector of $V$ taken outside the subspace $U_0$. According to the above theorem 4.7, the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{e}_{s+1}$ are linearly independent.

Let's denote by $U_1$ the linear span of the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{e}_{s+1}$. For the subspace $U_1$ we have the same two mutually exclusive options $U_1 = V$ or $U_1 \neq V$ as we previously had for the subspace $U_0$. If $U_1 = V$, then the process of completing the basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ is over. Otherwise, we can iterate the process and get a chain of subspaces enclosed into each other:

$$U_0 \subsetneq U_1 \subsetneq U_2 \subsetneq \ldots$$

This chain of subspaces cannot be infinite, since the dimension of every next subspace is greater by one than the dimension of the previous subspace, and the dimensions of all subspaces are not greater than the dimension of $V$. The process of completing the basis will be finished in the $(n - s)$-th step, where $U_{n-s} = V$. □
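The iterative process from this proof can be sketched for subspaces of $\mathbb{Q}^n$. The proof allows an arbitrary vector outside the current subspace at each step; the sketch below makes one particular choice, trying the standard unit vectors in turn, which is an assumption of this illustration and not part of the theorem. The names `rank` and `complete_basis` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def complete_basis(basis_u, n):
    """Extend a basis of a subspace U of Q^n to a basis of the whole space Q^n."""
    basis = [list(v) for v in basis_u]
    for j in range(n):
        unit = [1 if t == j else 0 for t in range(n)]
        # The unit vector lies outside the current span iff it raises the rank
        # (compare theorem 4.7); only in that case is it appended.
        if rank(basis + [unit]) > rank(basis):
            basis.append(unit)
    return basis


# Hypothetical example: U is the line in Q^3 spanned by (1, 1, 0).
full = complete_basis([[1, 1, 0]], 3)
print(full)
```

The process stops after $n - s = 2$ vectors have been added, in agreement with the step count in the proof.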

§ 5. Coordinates. Transformation of the 
coordinates of a vector under a change of basis. 

Let $V$ be some finite-dimensional linear vector space over the field $K$ and let $\dim V = n$. In this section we shall consider only finite-dimensional spaces. Let's choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$. Then an arbitrary vector $\mathbf{x} \in V$ can be expressed as a linear combination of the basis vectors:

$$\mathbf{x} = x^1 \cdot \mathbf{e}_1 + \ldots + x^n \cdot \mathbf{e}_n. \tag{5.1}$$

The linear combination (5.1) is called the expansion of the vector $\mathbf{x}$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Its coefficients $x^1, \ldots, x^n$ are elements of the numeric field $K$. They are called the components or the coordinates of the vector $\mathbf{x}$ in this basis.

We use upper indices for the literal notations of the coordinates of a vector $\mathbf{x}$ in (5.1). The usage of upper indices for the coordinates of vectors is determined by a special convention, which is known as tensorial notation. It was introduced to simplify bulky calculations in differential geometry and in the theory of relativity (see [2] and [3]). Other rules of tensorial notation are discussed in the coordinate theory of tensors (see [7]¹).

Theorem 5.1. For any vector $\mathbf{x} \in V$ its expansion in a basis of a linear vector space $V$ is unique.

Proof. The existence of an expansion (5.1) for a vector $\mathbf{x}$ follows from the item (2) of theorem 4.6. Assume that there is another expansion

$$\mathbf{x} = x'^{\,1} \cdot \mathbf{e}_1 + \ldots + x'^{\,n} \cdot \mathbf{e}_n. \tag{5.2}$$

Subtracting (5.1) from this equality, we get

$$\mathbf{0} = (x'^{\,1} - x^1) \cdot \mathbf{e}_1 + \ldots + (x'^{\,n} - x^n) \cdot \mathbf{e}_n. \tag{5.3}$$

Since the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are linearly independent, from the equality (5.3) it follows that the linear combination (5.3) is trivial: $x'^{\,i} - x^i = 0$. Then

$$x'^{\,1} = x^1, \ \ldots, \ x'^{\,n} = x^n.$$

Hence the expansions (5.1) and (5.2) do coincide. The uniqueness of the expansion (5.1) is proved. □
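For the space $\mathbb{Q}^n$ the coordinates of a vector in a given basis can be found by solving the linear system that the expansion (5.1) represents. The following is a minimal sketch under the assumption that the supplied vectors really form a basis (otherwise no pivot is found and the code fails); the function name `coordinates` and the sample basis are hypothetical.

```python
from fractions import Fraction


def coordinates(basis, x):
    """Solve x = x^1 * e_1 + ... + x^n * e_n by Gauss-Jordan elimination.

    Assumes `basis` is a list of n linearly independent vectors of length n.
    """
    n = len(basis)
    # Augmented matrix: column i holds the vector e_i, the last column holds x.
    a = [[Fraction(basis[i][r]) for i in range(n)] + [Fraction(x[r])]
         for r in range(n)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if a[i][c] != 0)
        a[c], a[pivot] = a[pivot], a[c]
        a[c] = [v / a[c][c] for v in a[c]]  # scale the pivot row to get 1
        for i in range(n):
            if i != c and a[i][c] != 0:
                f = a[i][c]
                a[i] = [u - f * v for u, v in zip(a[i], a[c])]
    return [a[r][n] for r in range(n)]


# Hypothetical basis e_1 = (1, 1), e_2 = (1, 2) of Q^2.
coords = coordinates([[1, 1], [1, 2]], [2, 1])
print(coords)  # the vector (2, 1) equals 3 * e_1 - 1 * e_2
```

The uniqueness stated in theorem 5.1 is what makes the result of such a computation well defined.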

Having chosen some basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in a space $V$ and expanding a vector $\mathbf{x}$ in this basis, we can write its coordinates in the form of a column vector. Due to theorem 5.1 this determines a bijective map $\psi : V \to K^n$. It is easy to verify that

$$\psi(\mathbf{x} + \mathbf{y}) = \begin{pmatrix} x^1 + y^1 \\ \vdots \\ x^n + y^n \end{pmatrix}, \qquad \psi(\alpha \cdot \mathbf{x}) = \begin{pmatrix} \alpha \cdot x^1 \\ \vdots \\ \alpha \cdot x^n \end{pmatrix}. \tag{5.4}$$

The above formulas (5.4) show that a basis is a very convenient tool when dealing with vectors. In a basis, algebraic operations with vectors are replaced by algebraic operations with their coordinates, i.e. with numeric quantities. However, the coordinate approach has one disadvantage. The mapping $\psi$ essentially depends on the basis we choose, and there is no canonical choice of basis. In general, no basis is preferable to another. Therefore we should be ready to consider various bases and should be able to recalculate the coordinates of vectors when passing from one basis to another.

¹ The reference [7] was added in 2004 to the English translation of this book.

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ be two arbitrary bases in a linear vector space $V$. We shall call them the «non-wavy» basis and the «wavy» basis (because of the tilde sign we use for denoting the vectors of one of them). The non-wavy basis will also be called the initial basis or the old basis, and the wavy one will be called the new basis. Taking the $i$-th vector of the new (wavy) basis, we expand it in the old basis:

$$\tilde{\mathbf{e}}_i = S^1_i \cdot \mathbf{e}_1 + \ldots + S^n_i \cdot \mathbf{e}_n. \tag{5.5}$$

According to the tensorial notation, the coordinates of the vector $\tilde{\mathbf{e}}_i$ in the expansion (5.5) are specified by the upper index. The lower index $i$ specifies the number of the vector $\tilde{\mathbf{e}}_i$ being expanded. In total, the expansions (5.5) determine $n^2$ numbers; they are usually arranged into a matrix:

$$S = \begin{pmatrix} S^1_1 & \ldots & S^1_n \\ \vdots & \ddots & \vdots \\ S^n_1 & \ldots & S^n_n \end{pmatrix}. \tag{5.6}$$

The upper index $j$ of the matrix element $S^j_i$ specifies the row number; the lower index $i$ specifies the column number. The matrix $S$ in (5.6) is called the direct transition matrix for passing from the old basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to the new basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$.

Swapping the bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$, we can write the expansion of the vector $\mathbf{e}_j$ in the wavy basis:

$$\mathbf{e}_j = T^1_j \cdot \tilde{\mathbf{e}}_1 + \ldots + T^n_j \cdot \tilde{\mathbf{e}}_n. \tag{5.7}$$

The coefficients of the expansion (5.7) determine the matrix $T$, which is called the inverse transition matrix. Certainly, the usage of the terms «direct» and «inverse» here is relative; it depends on which basis is considered as the old basis and which one is taken for the new one.

Theorem 5.2. The direct transition matrix S and the inverse transition matrix 
T determined by the expansions (5.5) and (5.7) are inverse to each other. 

Recall that two square matrices are inverse to each other if their product is equal to the unit matrix: $S\,T = 1$. Here we do not define the matrix multiplication, assuming that it is known from the course of general algebra.

Proof. Let's begin the proof of theorem 5.2 by writing the relationships (5.5) and (5.7) in a brief symbolic form:

$$\tilde{\mathbf{e}}_i = \sum_{k=1}^{n} S^k_i \cdot \mathbf{e}_k, \qquad \mathbf{e}_j = \sum_{i=1}^{n} T^i_j \cdot \tilde{\mathbf{e}}_i. \tag{5.8}$$

Then we substitute the first relationship (5.8) into the second one. This yields:

$$\mathbf{e}_j = \sum_{i=1}^{n} T^i_j \cdot \left( \sum_{k=1}^{n} S^k_i \cdot \mathbf{e}_k \right) = \sum_{k=1}^{n} \left( \sum_{i=1}^{n} S^k_i \, T^i_j \right) \cdot \mathbf{e}_k. \tag{5.9}$$

The symbol $\delta^k_j$, which is called the Kronecker symbol, is used for denoting the following numeric array:

$$\delta^k_j = \begin{cases} 1 & \text{for } k = j, \\ 0 & \text{for } k \neq j. \end{cases} \tag{5.10}$$

We apply the Kronecker symbol determined in (5.10) in order to transform the left hand side of the equality (5.9):

$$\mathbf{e}_j = \sum_{k=1}^{n} \delta^k_j \cdot \mathbf{e}_k. \tag{5.11}$$

Both equalities (5.11) and (5.9) represent the same vector $\mathbf{e}_j$ expanded in the same basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Due to theorem 5.1 on the uniqueness of the expansion of a vector in a basis we have the equality

$$\sum_{i=1}^{n} S^k_i \, T^i_j = \delta^k_j.$$

It is easy to note that this equality is equivalent to the matrix equality $S\,T = 1$. The theorem is proved. □
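Theorem 5.2 can be illustrated on a concrete pair of bases in a two-dimensional space. In the sketch below the old basis is taken to be the standard basis of $\mathbb{Q}^2$ and the new basis vectors are chosen arbitrarily; these choices, as well as the names `matmul`, `det2`, and `e_new`, are assumptions made only for this example.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][i] * b[i][c] for i in range(n)) for c in range(n)]
            for r in range(n)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical new basis, written in the old (standard) basis of Q^2.
e_new = [[1, 1], [1, 2]]

# Column i of S holds the coordinates of the i-th new vector, as in (5.5), (5.6).
S = [[1, 1],
     [1, 2]]
# Column j of T holds the coordinates of the j-th old vector in the new basis, as in (5.7).
T = [[2, -1],
     [-1, 1]]

# Check the expansions (5.7): e_j = T^1_j * e~_1 + T^2_j * e~_2.
old_from_new = [[sum(T[i][j] * e_new[i][t] for i in range(2)) for t in range(2)]
                for j in range(2)]

product = matmul(S, T)
print(old_from_new, product, det2(S) * det2(T))
```

The product `S T` is the unit matrix, and the product of the determinants equals one.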

Corollary. The direct transition matrix $S$ and the inverse transition matrix $T$ are both non-degenerate matrices, and $\det S \cdot \det T = 1$.

Proof. The relationship $\det S \cdot \det T = 1$ follows from the matrix equality $S\,T = 1$, which was proved just above, since the determinant of a product of matrices is equal to the product of their determinants. This fact is well known from the course of general algebra. If the product of two numbers is equal to unity, then neither of these two numbers can be equal to zero:

$$\det S \neq 0, \qquad \det T \neq 0.$$

This proves the non-degeneracy of the transition matrices $S$ and $T$. The corollary is proved. □

Theorem 5.3. Every non-degenerate $n \times n$ matrix $S$ can be obtained as a transition matrix for passing from some basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to some other basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ in a linear vector space $V$ of the dimension $n$.

Proof. Let's choose an arbitrary basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and fix it. Then let's determine the other $n$ vectors $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ by means of the relationships (5.5) and prove that they are linearly independent. For this purpose we consider a linear combination of these vectors that is equal to zero:

$$\alpha^1 \cdot \tilde{\mathbf{e}}_1 + \ldots + \alpha^n \cdot \tilde{\mathbf{e}}_n = \mathbf{0}. \tag{5.12}$$

Substituting (5.5) into this equality, one can transform it to the following one:

$$\left( \sum_{i=1}^{n} S^1_i \, \alpha^i \right) \cdot \mathbf{e}_1 + \ldots + \left( \sum_{i=1}^{n} S^n_i \, \alpha^i \right) \cdot \mathbf{e}_n = \mathbf{0}.$$

Since the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are linearly independent, it follows that all sums enclosed within the brackets in the above equality are equal to zero. Writing these sums in expanded form, we get a homogeneous system of linear algebraic equations with respect to the variables $\alpha^1, \ldots, \alpha^n$:

$$\begin{aligned} S^1_1 \, \alpha^1 + \ldots + S^1_n \, \alpha^n &= 0, \\ & \ldots \\ S^n_1 \, \alpha^1 + \ldots + S^n_n \, \alpha^n &= 0. \end{aligned}$$

The matrix of coefficients of this system coincides with $S$. From the course of algebra we know that each homogeneous system of linear equations with a non-degenerate square matrix has a unique solution, which is purely zero:

$$\alpha^1 = \ldots = \alpha^n = 0.$$

This means that an arbitrary linear combination (5.12), which is equal to zero, is necessarily trivial. Hence $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ is a linearly independent system of vectors. Applying the proposition (4) from theorem 4.5 to these vectors, we find that they form a basis in $V$, while the matrix $S$ appears to be the direct transition matrix for passing from $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. The theorem is proved. □

Let's consider two bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ in a linear vector space $V$ related by the transition matrix $S$. Let $\mathbf{x}$ be some arbitrary vector of the space $V$. It can be expanded in each of these two bases:

$$\mathbf{x} = \sum_{k=1}^{n} x^k \cdot \mathbf{e}_k, \qquad \mathbf{x} = \sum_{i=1}^{n} \tilde{x}^i \cdot \tilde{\mathbf{e}}_i. \tag{5.13}$$

Once the coordinates of $\mathbf{x}$ in one of these two bases are fixed, this fixes the vector $\mathbf{x}$ itself, and hence this fixes its coordinates in the other basis.

Theorem 5.4. The coordinates of a vector $\mathbf{x}$ in two bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ of a linear vector space $V$ are related by the formulas

$$x^k = \sum_{i=1}^{n} S^k_i \, \tilde{x}^i, \qquad \tilde{x}^i = \sum_{k=1}^{n} T^i_k \, x^k, \tag{5.14}$$

where $S$ and $T$ are the direct and inverse transition matrices for the passage from $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$, i.e. when $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is treated as the old basis and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ is treated as the new one.

The relationships (5.14) are known as the transformation formulas for the coordinates of a vector under a change of basis.

Proof. In order to prove the first relationship (5.14) we substitute the expansion of the vector $\tilde{\mathbf{e}}_i$ taken from (5.8) into the second relationship (5.13):

$$\mathbf{x} = \sum_{i=1}^{n} \tilde{x}^i \cdot \left( \sum_{k=1}^{n} S^k_i \cdot \mathbf{e}_k \right) = \sum_{k=1}^{n} \left( \sum_{i=1}^{n} S^k_i \, \tilde{x}^i \right) \cdot \mathbf{e}_k.$$

Comparing this expansion of $\mathbf{x}$ with the first expansion (5.13) and applying the theorem on the uniqueness of the expansion of a vector in a basis, we derive

$$x^k = \sum_{i=1}^{n} S^k_i \, \tilde{x}^i.$$

This is exactly the first transformation formula (5.14). The second formula (5.14) is proved similarly. □
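The transformation formulas (5.14) amount to multiplying a coordinate column by $S$ or by $T$. The sketch below uses a hypothetical pair of mutually inverse $2 \times 2$ matrices and a hypothetical helper `apply`; it converts new coordinates to old ones and back.

```python
def apply(m, col):
    """Multiply the matrix m (list of rows) by the coordinate column col."""
    return [sum(m[k][i] * col[i] for i in range(len(col))) for k in range(len(m))]


# Hypothetical direct and inverse transition matrices (S T = 1).
S = [[1, 1],
     [1, 2]]
T = [[2, -1],
     [-1, 1]]

x_new = [3, -1]          # coordinates of x in the new (wavy) basis
x_old = apply(S, x_new)  # first formula (5.14): old coordinates through new ones
back = apply(T, x_old)   # second formula (5.14): new coordinates through old ones
print(x_old, back)
```

Note the direction: the direct transition matrix $S$ expresses the new basis vectors through the old ones, but it converts the new coordinates into the old ones.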

§ 6. Intersections and sums of subspaces. 

Suppose that we have a certain number of subspaces in a linear vector space $V$. In order to designate this fact we write $U_i \subset V$, where $i \in I$. The number of subspaces can be finite or countably infinite; then they can be enumerated by the positive integers. However, in the general case we should enumerate the subspaces by the elements of some indexing set $I$, which can be finite, countably infinite, or even uncountable. Let's denote by $U$ and by $S$ the intersection and the union of all subspaces that we consider:

$$U = \bigcap_{i \in I} U_i, \qquad S = \bigcup_{i \in I} U_i. \tag{6.1}$$

Theorem 6.1. The intersection of an arbitrary number of subspaces in a linear vector space $V$ is a subspace in $V$.

Proof. The set $U$ in (6.1) is not empty since the zero vector is an element of each subspace $U_i$. Let's verify the conditions (1) and (2) from the definition 2.2 for $U$.

Suppose that $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}$ are vectors from the subset $U$. Then they belong to $U_i$ for each $i \in I$. However, $U_i$ is a subspace, hence $\mathbf{u}_1 + \mathbf{u}_2 \in U_i$ and $\alpha \cdot \mathbf{u} \in U_i$ for any $i \in I$ and for any $\alpha \in K$. Therefore, $\mathbf{u}_1 + \mathbf{u}_2 \in U$ and $\alpha \cdot \mathbf{u} \in U$. The theorem is proved. □

In general, the subset $S$ in (6.1) is not a subspace. Therefore we need to introduce the following concept.

Definition 6.1. The linear span of the union of subspaces $U_i$, $i \in I$, is called the sum of these subspaces.

To denote the sum of subspaces $W = \langle S \rangle$ we use the standard summation sign:

$$W = \sum_{i \in I} U_i.$$

Theorem 6.2. A vector $\mathbf{w}$ of a linear vector space $V$ belongs to the sum of subspaces $U_i$, $i \in I$, if and only if it is represented as a sum of a finite number of vectors, each of which is taken from some subspace $U_i$:

$$\mathbf{w} = \mathbf{u}_{i_1} + \ldots + \mathbf{u}_{i_k}, \text{ where } \mathbf{u}_{i_m} \in U_{i_m}. \tag{6.2}$$

Proof. Let $S$ be the union of the subspaces $U_i \subset V$, $i \in I$. Suppose that $\mathbf{w} \in W$. Then $\mathbf{w}$ is a linear combination of a finite number of vectors taken from $S$:

$$\mathbf{w} = \alpha_1 \cdot \mathbf{s}_1 + \ldots + \alpha_k \cdot \mathbf{s}_k.$$

But $S$ is the union of the subspaces $U_i$. Therefore, $\mathbf{s}_m \in U_{i_m}$ and $\alpha_m \cdot \mathbf{s}_m = \mathbf{u}_{i_m} \in U_{i_m}$, where $m = 1, \ldots, k$. This leads to the equality (6.2) for the vector $\mathbf{w}$.

Conversely, suppose that $\mathbf{w}$ is a vector given by the formula (6.2). Then $\mathbf{u}_{i_m} \in U_{i_m}$ and $U_{i_m} \subset S$, i.e. $\mathbf{u}_{i_m} \in S$. Therefore, the vector $\mathbf{w}$ belongs to the linear span of $S$. The theorem is proved. □

Definition 6.2. The sum $W$ of subspaces $U_i$, $i \in I$, is called the direct sum if for any vector $\mathbf{w} \in W$ the expansion (6.2) is unique. In this case for the direct sum of subspaces we use the special notation:

$$W = \bigoplus_{i \in I} U_i.$$

Theorem 6.3. Let $W = U_1 + \ldots + U_k$ be the sum of a finite number of finite-dimensional subspaces. The dimension of $W$ is equal to the sum of the dimensions of the subspaces $U_i$ if and only if $W$ is the direct sum: $W = U_1 \oplus \ldots \oplus U_k$.

Proof. Let's choose a basis in each subspace $U_i$. Suppose that $\dim U_i = s_i$ and let $\mathbf{e}_{i1}, \ldots, \mathbf{e}_{i s_i}$ be a basis in $U_i$. Let's join the vectors of all bases into one system, ordering them lexicographically:

$$\mathbf{e}_{11}, \ldots, \mathbf{e}_{1 s_1}, \ \ldots, \ \mathbf{e}_{k1}, \ldots, \mathbf{e}_{k s_k}. \tag{6.3}$$

Due to the equality $W = U_1 + \ldots + U_k$, for an arbitrary vector $\mathbf{w}$ of the subspace $W$ we have the expansion (6.2):

$$\mathbf{w} = \mathbf{u}_1 + \ldots + \mathbf{u}_k, \text{ where } \mathbf{u}_i \in U_i. \tag{6.4}$$

Expanding each vector $\mathbf{u}_i$ of (6.4) in the basis of the corresponding subspace $U_i$, we get the expansion of $\mathbf{w}$ in the vectors of the system (6.3). Hence (6.3) is a spanning system of vectors in $W$ (though, in the general case, it is not a minimal spanning system).

If $\dim W = \dim U_1 + \ldots + \dim U_k$, then the spanning system (6.3) contains exactly $\dim W$ vectors, so the number of vectors in it cannot be reduced. Therefore (6.3) is a minimal spanning system, i.e. a basis in $W$. From any expansion (6.4) we can derive the following expansion of the vector $\mathbf{w}$ in the basis (6.3):

$$\mathbf{w} = \left( \sum_{j=1}^{s_1} \alpha_{1j} \cdot \mathbf{e}_{1j} \right) + \ldots + \left( \sum_{j=1}^{s_k} \alpha_{kj} \cdot \mathbf{e}_{kj} \right). \tag{6.5}$$

The sums enclosed into the round brackets in (6.5) are determined by the expansions of the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ in the bases of the corresponding subspaces $U_1, \ldots, U_k$:

$$\mathbf{u}_i = \sum_{j=1}^{s_i} \alpha_{ij} \cdot \mathbf{e}_{ij}. \tag{6.6}$$

Due to (6.6), the existence of two different expansions (6.4) for some vector $\mathbf{w}$ would mean the existence of two different expansions (6.5) of this vector in the basis (6.3), which is impossible due to theorem 5.1. Hence the expansion (6.4) is unique and the sum of subspaces $W = U_1 + \ldots + U_k$ is the direct sum.
Conversely, suppose that $W = U_1 \oplus \ldots \oplus U_k$. We know that the vectors (6.3) span the subspace $W$. Let's prove that they are linearly independent. For this purpose we consider a linear combination of these vectors equal to zero:

$$\left( \sum_{j=1}^{s_1} \alpha_{1j} \cdot \mathbf{e}_{1j} \right) + \ldots + \left( \sum_{j=1}^{s_k} \alpha_{kj} \cdot \mathbf{e}_{kj} \right) = \mathbf{0}. \tag{6.7}$$

Let's denote by $\tilde{\mathbf{u}}_1, \ldots, \tilde{\mathbf{u}}_k$ the values of the sums enclosed into the round brackets in (6.7). It is easy to see that $\tilde{\mathbf{u}}_i \in U_i$, therefore (6.7) is an expansion of the form (6.4) for the vector $\mathbf{w} = \mathbf{0}$. But $\mathbf{0} = \mathbf{0} + \ldots + \mathbf{0}$ and $\mathbf{0} \in U_i$. This is another expansion for the vector $\mathbf{w} = \mathbf{0}$. However, $W = U_1 \oplus \ldots \oplus U_k$, therefore the expansion $\mathbf{0} = \mathbf{0} + \ldots + \mathbf{0}$ is the unique expansion of the form (6.4) for the zero vector $\mathbf{w} = \mathbf{0}$. Then we have the equalities

$$\mathbf{0} = \sum_{j=1}^{s_i} \alpha_{ij} \cdot \mathbf{e}_{ij} \ \text{ for all } i = 1, \ldots, k.$$

It's clear that these equalities are the expansions of the zero vector in the bases of the subspaces $U_i$. Hence $\alpha_{ij} = 0$. This means that the linear combination (6.7) is trivial, and (6.3) is a linearly independent system of vectors. Thus, being a spanning system and being linearly independent, the system of vectors (6.3) is a basis of $W$. Now we can find the dimension of the subspace $W$ by counting the number of vectors in (6.3): $\dim W = s_1 + \ldots + s_k = \dim U_1 + \ldots + \dim U_k$. The theorem is proved. □
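For subspaces of $\mathbb{Q}^n$ given by spanning vectors, theorem 6.3 yields a simple computational test: the sum is direct exactly when the rank of the joined system (6.3) equals the sum of the individual dimensions. The sketch below assumes this setting; `rank`, `is_direct_sum`, and the sample subspaces are hypothetical names used only for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_direct_sum(subspaces):
    """Each subspace is given by a list of spanning vectors.

    The sum is direct iff dim(U_1 + ... + U_k) = dim U_1 + ... + dim U_k.
    """
    joined = [v for vectors in subspaces for v in vectors]
    return rank(joined) == sum(rank(vectors) for vectors in subspaces)


# Hypothetical subspaces of Q^3.
line_x = [[1, 0, 0]]
plane_yz = [[0, 1, 0], [0, 0, 1]]
plane_xy = [[1, 0, 0], [0, 1, 0]]

direct = is_direct_sum([line_x, plane_yz])        # dimensions 1 + 2 = 3
not_direct = is_direct_sum([plane_xy, plane_yz])  # dim of the sum is 3 < 2 + 2
print(direct, not_direct)
```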

Note. If the sum of subspaces $W = U_1 + \ldots + U_k$ is not necessarily the direct sum, the vectors (6.3) nevertheless form a spanning system in $W$. But they do not necessarily form a linearly independent system in this case. Therefore, we have

$$\dim W \leq \dim U_1 + \ldots + \dim U_k. \tag{6.8}$$

Sharpening this inequality in the general case is rather complicated. We shall do it for the case of two subspaces.

Theorem 6.4. The dimension of the sum of two arbitrary finite-dimensional subspaces $U_1$ and $U_2$ in a linear vector space $V$ is equal to the sum of their dimensions minus the dimension of their intersection:

$$\dim(U_1 + U_2) = \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2). \tag{6.9}$$

Proof. From the inclusion $U_1 \cap U_2 \subset U_1$ and from the inequality (6.8) we conclude that all subspaces considered in the theorem are finite-dimensional. Let's denote $\dim(U_1 \cap U_2) = s$ and choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ in the intersection $U_1 \cap U_2$.

Due to the inclusion $U_1 \cap U_2 \subset U_1$ we can apply theorem 4.8 on completing the basis. This theorem says that we can complete the basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ of the intersection $U_1 \cap U_2$ up to a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{e}_{s+1}, \ldots, \mathbf{e}_{s+p}$ in $U_1$. For the dimension of $U_1$ we have $\dim U_1 = s + p$. In a similar way, due to the inclusion $U_1 \cap U_2 \subset U_2$ we can construct a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s, \mathbf{e}_{s+p+1}, \ldots, \mathbf{e}_{s+p+q}$ in $U_2$. For the dimension of $U_2$ this yields $\dim U_2 = s + q$.
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Now let's join together the two bases constructed above with the use of 
theorem 4.8 and consider the total set of vectors in them: 

d, . . . , e s , e s+ i, . . . , e s+p , e s+p+ i, . . . , e s+p+q . (6.10) 

Let's prove that these vectors (6.10) form a basis in the sum of subspaces U\ + U 2 - 
Let w be some arbitrary vector in Ui + U 2 . The relationship (6.2) for this vector 
is written as w = U[ + u 2 . Let's expand the vectors U! and u 2 in the above two 
bases of the subspaces U\ and U 2 respectively: 

s p 
s q 

u 2 = ^ a, • e» + ^ ls+p+j ■ e s+p +j . 
i=i j=i 

Adding these two equalities, we find that the vector w is linearly expressed through 
the vectors (6.10). Hence, (6.10) is a spanning system of vectors in U\ + U 2 . 

In order to prove that (6.10) is a linearly independent system of vectors we consider a linear combination of these vectors being equal to zero:

$$\sum_{i=1}^{s+p}\alpha_i\cdot\mathbf{e}_i+\sum_{i=1}^{q}\alpha_{s+p+i}\cdot\mathbf{e}_{s+p+i}=\mathbf{0}. \tag{6.11}$$

Then we transform this equality by moving the second sum to the right hand side:

$$\sum_{i=1}^{s+p}\alpha_i\cdot\mathbf{e}_i=-\sum_{i=1}^{q}\alpha_{s+p+i}\cdot\mathbf{e}_{s+p+i}.$$

Let's denote by $\mathbf{u}$ the value of the left and right sides of this equality. Then for the vector $\mathbf{u}$ we get the following two expressions:

$$\mathbf{u}=\sum_{i=1}^{s+p}\alpha_i\cdot\mathbf{e}_i,\qquad \mathbf{u}=-\sum_{i=1}^{q}\alpha_{s+p+i}\cdot\mathbf{e}_{s+p+i}. \tag{6.12}$$

Because of the first expression (6.12) we have $\mathbf{u}\in U_1$, while the second expression (6.12) yields $\mathbf{u}\in U_2$. Hence, $\mathbf{u}\in U_1\cap U_2$. This means that we can expand the vector $\mathbf{u}$ in the basis $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_s$:

$$\mathbf{u}=\sum_{i=1}^{s}\beta_i\cdot\mathbf{e}_i. \tag{6.13}$$

Comparing this expansion with the second expression (6.12), we find that

$$\sum_{i=1}^{s}\beta_i\cdot\mathbf{e}_i+\sum_{i=1}^{q}\alpha_{s+p+i}\cdot\mathbf{e}_{s+p+i}=\mathbf{0}. \tag{6.14}$$

Note that the vectors $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_s,\,\mathbf{e}_{s+p+1},\,\ldots,\,\mathbf{e}_{s+p+q}$ form a basis in $U_2$. They are linearly independent. Therefore, all coefficients in (6.14) are equal to zero. In particular, we have the following equalities:

$$\alpha_{s+p+1}=\ldots=\alpha_{s+p+q}=0. \tag{6.15}$$

Moreover, $\beta_1=\ldots=\beta_s=0$. Due to (6.13) this means that $\mathbf{u}=\mathbf{0}$. Now from the first expansion (6.12) we get the equality

$$\sum_{i=1}^{s+p}\alpha_i\cdot\mathbf{e}_i=\mathbf{0}.$$

Since $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_s,\,\mathbf{e}_{s+1},\,\ldots,\,\mathbf{e}_{s+p}$ are linearly independent vectors, all coefficients $\alpha_i$ in the above equality should be zero:

$$\alpha_1=\ldots=\alpha_s=\alpha_{s+1}=\ldots=\alpha_{s+p}=0. \tag{6.16}$$

Combining (6.15) and (6.16), we see that the linear combination (6.11) is trivial. This means that the vectors (6.10) are linearly independent. Hence, they form a basis in $U_1+U_2$. For the dimension of the subspace $U_1+U_2$ this yields

$$\begin{aligned}
\dim(U_1+U_2)&=s+p+q=(s+p)+(s+q)-s=\\
&=\dim U_1+\dim U_2-\dim(U_1\cap U_2).
\end{aligned}$$

Thus, the relationship (6.9) and, with it, theorem 6.4 are proved. □
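
The dimension count (6.9) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the subspaces of $\mathbb{Q}^4$, the helper names `rank` and `in_span`, and the hand-supplied spanning vector of the intersection are all hypothetical choices made here, not part of the text. Ranks are computed by Gaussian elimination over exact rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def in_span(v, rows):
    """True if v is a linear combination of the given rows."""
    return rank(rows + [v]) == rank(rows)

# Spanning vectors (as rows) of two subspaces of Q^4; illustrative data.
U1 = [[1, 0, 1, 0], [0, 1, 0, 1]]
U2 = [[1, 1, 1, 1], [1, 0, 0, 0]]

# Hand-supplied spanning vector of the intersection (an assumption of this sketch):
# (1,1,1,1) is the sum of the two vectors spanning U1 and is also listed in U2.
intersection = [[1, 1, 1, 1]]

dim_U1 = rank(U1)
dim_U2 = rank(U2)
dim_sum = rank(U1 + U2)        # list concatenation: the sum is spanned by both lists
dim_int = rank(intersection)

# The supplied vector really lies in both subspaces, while (1,0,0,0) is not in U1,
# so the intersection is exactly one-dimensional.
assert all(in_span(v, U1) and in_span(v, U2) for v in intersection)
assert not in_span([1, 0, 0, 0], U1)

print(dim_sum, dim_U1 + dim_U2 - dim_int)
```

Both printed numbers agree, as formula (6.9) predicts for this particular pair of subspaces.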

§ 7. Cosets of a subspace. The concept of factorspace. 

Let $V$ be a linear vector space and let $U$ be a subspace in it. A coset of the subspace $U$ determined by a vector $\mathbf{v}\in V$ is the following set of vectors¹:

$$\operatorname{Cl}_U(\mathbf{v})=\{\mathbf{w}\in V:\ \mathbf{w}-\mathbf{v}\in U\}. \tag{7.1}$$

The vector $\mathbf{v}$ in (7.1) is called a representative of the coset (7.1). The coset $\operatorname{Cl}_U(\mathbf{v})$ is a simple object: it is obtained by adding the vector $\mathbf{v}$ to each vector of the subspace $U$. The coset represented by the zero vector is especially simple since $\operatorname{Cl}_U(\mathbf{0})=U$. It is called the zero coset.

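Theorem 7.1. The cosets of a subspace $U$ in a linear vector space $V$ possess the following properties:

(1) $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{a})$ for any $\mathbf{a}\in V$;

(2) if $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$, then $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{a})$;

(3) if $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$ and $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{c})$, then $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{c})$.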

Proof. The first proposition is obvious. Indeed, the difference $\mathbf{a}-\mathbf{a}$ is equal to the zero vector, which is an element of any subspace: $\mathbf{a}-\mathbf{a}=\mathbf{0}\in U$. Hence, due to the formula (7.1), which is the formal definition of cosets, we have $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{a})$.

Let $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$. Then $\mathbf{a}-\mathbf{b}\in U$. For $\mathbf{b}-\mathbf{a}$, we have $\mathbf{b}-\mathbf{a}=(-1)\cdot(\mathbf{a}-\mathbf{b})$. Therefore, $\mathbf{b}-\mathbf{a}\in U$ and $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{a})$ (see formula (7.1) and the definition 2.2). The second proposition is proved.

Let $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$ and $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{c})$. Then $\mathbf{a}-\mathbf{b}\in U$ and $\mathbf{b}-\mathbf{c}\in U$. Note that $\mathbf{a}-\mathbf{c}=(\mathbf{a}-\mathbf{b})+(\mathbf{b}-\mathbf{c})$. Hence, $\mathbf{a}-\mathbf{c}\in U$ and $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{c})$ (see formula (7.1) and the definition 2.2 again). The third proposition is proved. This completes the proof of the theorem. □

¹ We use the sign Cl for cosets since in Russia they are called adjacency classes.

Let $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$. This condition establishes some kind of dependence between two vectors $\mathbf{a}$ and $\mathbf{b}$. This dependence is not strict: the condition $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$ does not exclude the possibility that $\mathbf{a}'\in\operatorname{Cl}_U(\mathbf{b})$ for some other vector $\mathbf{a}'$. Such non-strict dependences in mathematics are described by the concept of binary relation (see details in [1] and [4]). Let's write $\mathbf{a}\sim\mathbf{b}$ as an abbreviation for $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$. Then theorem 7.1 reveals the following properties of the binary relation $\mathbf{a}\sim\mathbf{b}$ just introduced:

(1) reflexivity: $\mathbf{a}\sim\mathbf{a}$;

(2) symmetry: $\mathbf{a}\sim\mathbf{b}$ implies $\mathbf{b}\sim\mathbf{a}$;

(3) transitivity: $\mathbf{a}\sim\mathbf{b}$ and $\mathbf{b}\sim\mathbf{c}$ imply $\mathbf{a}\sim\mathbf{c}$.

A binary relation possessing the properties of reflexivity, symmetry, and transitivity is called an equivalence relation. Each equivalence relation defined in a set $V$ partitions this set into a union of mutually non-intersecting subsets, which are called the equivalence classes:

$$\operatorname{Cl}(\mathbf{v})=\{\mathbf{w}\in V:\ \mathbf{w}\sim\mathbf{v}\}. \tag{7.2}$$

In our particular case the formal definition (7.2) coincides with the formal definition (7.1). To keep the presentation self-contained, we shall not use the notation $\mathbf{a}\sim\mathbf{b}$ in place of $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$ anymore, and we shall not refer to the theory of binary relations (though it is simple and well-known). Instead, we shall derive the result on partitioning $V$ into mutually non-intersecting cosets from the following theorem.

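Theorem 7.2. If two cosets $\operatorname{Cl}_U(\mathbf{a})$ and $\operatorname{Cl}_U(\mathbf{b})$ of a subspace $U\subset V$ intersect, then they coincide.

Proof. Assume that the intersection of two cosets $\operatorname{Cl}_U(\mathbf{a})$ and $\operatorname{Cl}_U(\mathbf{b})$ is not empty. Then there is an element $\mathbf{c}$ belonging to both of them: $\mathbf{c}\in\operatorname{Cl}_U(\mathbf{a})$ and $\mathbf{c}\in\operatorname{Cl}_U(\mathbf{b})$. Due to the proposition (2) of the above theorem 7.1 we derive $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{c})$. Combining $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{c})$ and $\mathbf{c}\in\operatorname{Cl}_U(\mathbf{a})$ and applying the proposition (3) of the theorem 7.1, we get $\mathbf{b}\in\operatorname{Cl}_U(\mathbf{a})$. The opposite inclusion $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$ is then obtained by applying the proposition (2) of the theorem 7.1.

Let's prove that the two cosets $\operatorname{Cl}_U(\mathbf{a})$ and $\operatorname{Cl}_U(\mathbf{b})$ coincide. For this purpose let's consider an arbitrary vector $\mathbf{x}\in\operatorname{Cl}_U(\mathbf{a})$. From $\mathbf{x}\in\operatorname{Cl}_U(\mathbf{a})$ and $\mathbf{a}\in\operatorname{Cl}_U(\mathbf{b})$, by the proposition (3) of the theorem 7.1, we derive $\mathbf{x}\in\operatorname{Cl}_U(\mathbf{b})$. Hence, $\operatorname{Cl}_U(\mathbf{a})\subset\operatorname{Cl}_U(\mathbf{b})$. The opposite inclusion $\operatorname{Cl}_U(\mathbf{b})\subset\operatorname{Cl}_U(\mathbf{a})$ is proved similarly. From these two inclusions we derive $\operatorname{Cl}_U(\mathbf{a})=\operatorname{Cl}_U(\mathbf{b})$. The theorem is proved. □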
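
As an illustration, take $V=\mathbb{K}^2$ and let $U$ be the horizontal axis. This particular space and subspace are an assumption made here for concreteness. The cosets are then horizontal lines, and two such lines either coincide or do not meet:

```latex
U = \{(t,\,0) : t \in \mathbb{K}\} \subset \mathbb{K}^2, \qquad
\operatorname{Cl}_U\bigl((x_0,\,y_0)\bigr) = \{(x,\,y) \in \mathbb{K}^2 : y = y_0\},
\\
\operatorname{Cl}_U\bigl((x_0,\,y_0)\bigr) \cap \operatorname{Cl}_U\bigl((x_1,\,y_1)\bigr) \neq \varnothing
\iff y_0 = y_1
\iff \operatorname{Cl}_U\bigl((x_0,\,y_0)\bigr) = \operatorname{Cl}_U\bigl((x_1,\,y_1)\bigr).
```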

The set of all cosets of a subspace $U$ in a linear vector space $V$ is called the factorset or quotient set $V/U$. Due to the theorem proved just above, any two different cosets $Q_1$ and $Q_2$ from the factorset $V/U$ have the empty intersection $Q_1\cap Q_2=\varnothing$, while the union of all cosets coincides with $V$:

$$V=\bigcup_{Q\in V/U}Q.$$

This equality is a consequence of the fact that any vector $\mathbf{v}\in V$ is an element of some coset: $\mathbf{v}\in Q$. This coset $Q$ is determined by $\mathbf{v}$ according to the formula $Q=\operatorname{Cl}_U(\mathbf{v})$. For this reason the following theorem is a simple reformulation of the definition of cosets.

Theorem 7.3. Two vectors $\mathbf{v}$ and $\mathbf{w}$ belong to the same coset of a subspace $U$ if and only if their difference $\mathbf{v}-\mathbf{w}$ is a vector of $U$.

Definition 7.1. Let $Q_1$ and $Q_2$ be two cosets of a subspace $U$. The sum of cosets $Q_1$ and $Q_2$ is a coset $Q$ of the subspace $U$ determined by the equality $Q=\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2)$, where $\mathbf{v}_1\in Q_1$ and $\mathbf{v}_2\in Q_2$.

Definition 7.2. Let $Q$ be a coset of a subspace $U$. The product of $Q$ and a number $\alpha\in\mathbb{K}$ is a coset $P$ of the subspace $U$ determined by the relationship $P=\operatorname{Cl}_U(\alpha\cdot\mathbf{v})$, where $\mathbf{v}\in Q$.

For the addition of cosets and for the multiplication of them by numbers we use the same signs of algebraic operations as in the case of vectors, i.e. $Q=Q_1+Q_2$ and $P=\alpha\cdot Q$. The definitions 7.1 and 7.2 can be expressed by the formulas

$$\begin{aligned}
\operatorname{Cl}_U(\mathbf{v}_1)+\operatorname{Cl}_U(\mathbf{v}_2)&=\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2),\\
\alpha\cdot\operatorname{Cl}_U(\mathbf{v})&=\operatorname{Cl}_U(\alpha\cdot\mathbf{v}).
\end{aligned} \tag{7.3}$$

These definitions require some comments. Indeed, the coset $Q=Q_1+Q_2$ in the definition 7.1 and the coset $P=\alpha\cdot Q$ in the definition 7.2 are both determined using some representative vectors $\mathbf{v}_1\in Q_1$, $\mathbf{v}_2\in Q_2$, and $\mathbf{v}\in Q$. The choice of a representative vector in a coset is not unique; therefore, we need to prove specifically that the results of the algebraic operations determined in the definitions 7.1 and 7.2 are unique. This proof is called the proof of correctness (that is, of well-definedness).

Theorem 7.4. The definitions 7.1 and 7.2 are correct and the results of the algebraic operations of coset addition and of coset multiplication by numbers do not depend on the choice of representatives in cosets.

Proof. To begin with, we study the operation of coset addition. Let's consider two different choices of representatives within the cosets $Q_1$ and $Q_2$. Let $\mathbf{v}_1,\,\tilde{\mathbf{v}}_1$ be two vectors of $Q_1$ and let $\mathbf{v}_2,\,\tilde{\mathbf{v}}_2$ be two vectors of $Q_2$. Then due to the theorem 7.3 we have the following two relationships:

$$\tilde{\mathbf{v}}_1-\mathbf{v}_1\in U,\qquad \tilde{\mathbf{v}}_2-\mathbf{v}_2\in U.$$

Hence, $(\tilde{\mathbf{v}}_1+\tilde{\mathbf{v}}_2)-(\mathbf{v}_1+\mathbf{v}_2)=(\tilde{\mathbf{v}}_1-\mathbf{v}_1)+(\tilde{\mathbf{v}}_2-\mathbf{v}_2)\in U$. This means that the cosets determined by the vectors $\tilde{\mathbf{v}}_1+\tilde{\mathbf{v}}_2$ and $\mathbf{v}_1+\mathbf{v}_2$ coincide with each other:

$$\operatorname{Cl}_U(\tilde{\mathbf{v}}_1+\tilde{\mathbf{v}}_2)=\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2).$$

This proves the correctness of the definition 7.1 for the operation of coset addition.

Now let's consider two different representatives $\mathbf{v}$ and $\tilde{\mathbf{v}}$ within the coset $Q$. Then $\tilde{\mathbf{v}}-\mathbf{v}\in U$. Hence, $\alpha\cdot\tilde{\mathbf{v}}-\alpha\cdot\mathbf{v}=\alpha\cdot(\tilde{\mathbf{v}}-\mathbf{v})\in U$. This yields

$$\operatorname{Cl}_U(\alpha\cdot\tilde{\mathbf{v}})=\operatorname{Cl}_U(\alpha\cdot\mathbf{v}),$$

which proves the correctness of the definition 7.2 for the operation of multiplication of cosets by numbers. □
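
The independence from the choice of representatives can be observed on a concrete example. The following is a minimal sketch under stated assumptions: the plane $U\subset\mathbb{Q}^3$, the vectors, and the helper names (`rank`, `same_coset`, `add`, `scale`) are hypothetical and chosen here only for illustration. Two vectors are tested for membership in one coset via theorem 7.3, i.e. by checking whether their difference lies in $U$.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def same_coset(v, w, U):
    """Theorem 7.3: v and w lie in one coset of span(U) iff v - w is in span(U)."""
    diff = [a - b for a, b in zip(v, w)]
    return rank(U + [diff]) == rank(U)

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def scale(c, v):
    return [c * a for a in v]

U = [[1, 1, 0], [0, 1, 1]]      # basis of a plane U in Q^3 (illustrative)
v1, v2 = [1, 0, 0], [0, 0, 2]   # representatives of two cosets

# Other representatives of the same cosets: shift by vectors of U.
v1_alt = add(v1, [1, 1, 0])
v2_alt = add(v2, [1, 2, 1])     # (1,2,1) = (1,1,0) + (0,1,1) lies in U

# The sum and the product land in the same coset whichever representatives are used.
sum_same = same_coset(add(v1, v2), add(v1_alt, v2_alt), U)
prod_same = same_coset(scale(3, v1), scale(3, v1_alt), U)
print(sum_same, prod_same)
```

In this example `v1` and `v2` themselves represent different cosets, so the check is not vacuous.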

Theorem 7.5. The factorset V/U of a linear vector space V over a subspace 
U equipped with algebraic operations (7.3) is a linear vector space. This space is 
called the factorspace or the quotient space of the space V over its subspace U. 

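PROOF. The proof of this theorem consists in verifying the axioms (1)-(8) of a linear vector space for $V/U$. The commutativity and associativity axioms for the operation of coset addition follow from the following calculations:

$$\begin{aligned}
\operatorname{Cl}_U(\mathbf{v}_1)+\operatorname{Cl}_U(\mathbf{v}_2)&=\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2)=\\
&=\operatorname{Cl}_U(\mathbf{v}_2+\mathbf{v}_1)=\operatorname{Cl}_U(\mathbf{v}_2)+\operatorname{Cl}_U(\mathbf{v}_1),\\[4pt]
(\operatorname{Cl}_U(\mathbf{v}_1)+\operatorname{Cl}_U(\mathbf{v}_2))+\operatorname{Cl}_U(\mathbf{v}_3)&=\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2)+\operatorname{Cl}_U(\mathbf{v}_3)=\\
&=\operatorname{Cl}_U((\mathbf{v}_1+\mathbf{v}_2)+\mathbf{v}_3)=\operatorname{Cl}_U(\mathbf{v}_1+(\mathbf{v}_2+\mathbf{v}_3))=\\
&=\operatorname{Cl}_U(\mathbf{v}_1)+\operatorname{Cl}_U(\mathbf{v}_2+\mathbf{v}_3)=\operatorname{Cl}_U(\mathbf{v}_1)+(\operatorname{Cl}_U(\mathbf{v}_2)+\operatorname{Cl}_U(\mathbf{v}_3)).
\end{aligned}$$

In essence, they follow from the corresponding axioms for the operation of vector addition (see definition 2.1).

In order to verify the axiom (3) we should have a zero element in $V/U$. The zero coset $\mathbf{0}=\operatorname{Cl}_U(\mathbf{0})$ is the best candidate for this role:

$$\operatorname{Cl}_U(\mathbf{v})+\operatorname{Cl}_U(\mathbf{0})=\operatorname{Cl}_U(\mathbf{v}+\mathbf{0})=\operatorname{Cl}_U(\mathbf{v}).$$

In verifying the axiom (4) we should indicate the opposite coset $Q'$ for a coset $Q=\operatorname{Cl}_U(\mathbf{v})$. We define it as follows: $Q'=\operatorname{Cl}_U(\mathbf{v}')$, where $\mathbf{v}'$ is the vector opposite to $\mathbf{v}$. Then

$$Q+Q'=\operatorname{Cl}_U(\mathbf{v})+\operatorname{Cl}_U(\mathbf{v}')=\operatorname{Cl}_U(\mathbf{v}+\mathbf{v}')=\operatorname{Cl}_U(\mathbf{0})=\mathbf{0}.$$

The remaining axioms (5)-(8) are verified by direct calculations on the basis of the formulas (7.3) for coset operations. Here are these calculations:

$$\begin{aligned}
\alpha\cdot(\operatorname{Cl}_U(\mathbf{v}_1)+\operatorname{Cl}_U(\mathbf{v}_2))&=\alpha\cdot\operatorname{Cl}_U(\mathbf{v}_1+\mathbf{v}_2)=\\
&=\operatorname{Cl}_U(\alpha\cdot(\mathbf{v}_1+\mathbf{v}_2))=\operatorname{Cl}_U(\alpha\cdot\mathbf{v}_1+\alpha\cdot\mathbf{v}_2)=\\
&=\operatorname{Cl}_U(\alpha\cdot\mathbf{v}_1)+\operatorname{Cl}_U(\alpha\cdot\mathbf{v}_2)=\alpha\cdot\operatorname{Cl}_U(\mathbf{v}_1)+\alpha\cdot\operatorname{Cl}_U(\mathbf{v}_2),\\[4pt]
(\alpha+\beta)\cdot\operatorname{Cl}_U(\mathbf{v})&=\operatorname{Cl}_U((\alpha+\beta)\cdot\mathbf{v})=\operatorname{Cl}_U(\alpha\cdot\mathbf{v}+\beta\cdot\mathbf{v})=\\
&=\operatorname{Cl}_U(\alpha\cdot\mathbf{v})+\operatorname{Cl}_U(\beta\cdot\mathbf{v})=\alpha\cdot\operatorname{Cl}_U(\mathbf{v})+\beta\cdot\operatorname{Cl}_U(\mathbf{v}),\\[4pt]
\alpha\cdot(\beta\cdot\operatorname{Cl}_U(\mathbf{v}))&=\alpha\cdot\operatorname{Cl}_U(\beta\cdot\mathbf{v})=\operatorname{Cl}_U(\alpha\cdot(\beta\cdot\mathbf{v}))=\\
&=\operatorname{Cl}_U((\alpha\beta)\cdot\mathbf{v})=(\alpha\beta)\cdot\operatorname{Cl}_U(\mathbf{v}),\\[4pt]
1\cdot\operatorname{Cl}_U(\mathbf{v})&=\operatorname{Cl}_U(1\cdot\mathbf{v})=\operatorname{Cl}_U(\mathbf{v}).
\end{aligned}$$

The above equalities complete the verification of the fact that the factorset $V/U$ possesses the structure of a linear vector space. □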

Note that in verifying the axiom (4) we have defined the opposite coset $Q'$ for a coset $Q=\operatorname{Cl}_U(\mathbf{v})$ by means of the relationship $Q'=\operatorname{Cl}_U(\mathbf{v}')$, where $\mathbf{v}'$ is the opposite vector for $\mathbf{v}$. One could check the correctness of this definition. However, this is not necessary since due to the property (10), see theorem 2.1, the opposite coset $Q'$ for $Q$ is unique.

The concept of factorspace is equally applicable to finite-dimensional and to infinite-dimensional spaces $V$. The finite or infinite dimensionality of a subspace $U$ also makes no difference. The only simplification in the finite-dimensional case is that we can calculate the dimension of the factorspace $V/U$.

Theorem 7.6. If a linear vector space $V$ is finite-dimensional, then for any subspace $U$ of it the factorspace $V/U$ is also finite-dimensional and its dimension is determined by the following formula:

$$\dim U+\dim(V/U)=\dim V. \tag{7.4}$$

PROOF. If $U=V$, then the factorspace $V/U$ consists of the zero coset only: $V/U=\{\mathbf{0}\}$. The dimension of such a zero space is equal to zero. Hence, the equality (7.4) in this trivial case is fulfilled.

Let's consider the nontrivial case $U\subsetneq V$. Due to the theorem 4.5 the subspace $U$ is finite-dimensional. Denote $\dim V=n$ and $\dim U=s$, then $s<n$. Let's choose a basis $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_s$ in $U$ and, according to the theorem 4.8, complete it with vectors $\mathbf{e}_{s+1},\,\ldots,\,\mathbf{e}_n$ up to a basis in $V$. For each of the complementary vectors $\mathbf{e}_{s+1},\,\ldots,\,\mathbf{e}_n$ we consider the corresponding coset of the subspace $U$:

$$E_1=\operatorname{Cl}_U(\mathbf{e}_{s+1}),\ \ldots,\ E_{n-s}=\operatorname{Cl}_U(\mathbf{e}_n). \tag{7.5}$$

Now let's show that the cosets (7.5) span the factorspace $V/U$. Indeed, let $Q$ be an arbitrary coset in $V/U$ and let $\mathbf{v}\in Q$ be some representative vector of this coset. Let's expand the vector $\mathbf{v}$ in the above basis of $V$:

$$\mathbf{v}=(\alpha_1\cdot\mathbf{e}_1+\ldots+\alpha_s\cdot\mathbf{e}_s)+\beta_1\cdot\mathbf{e}_{s+1}+\ldots+\beta_{n-s}\cdot\mathbf{e}_n.$$

Let's denote by $\mathbf{u}$ the initial part of this expansion: $\mathbf{u}=\alpha_1\cdot\mathbf{e}_1+\ldots+\alpha_s\cdot\mathbf{e}_s$. It is clear that $\mathbf{u}\in U$. Then we can write

$$\mathbf{v}=\mathbf{u}+\beta_1\cdot\mathbf{e}_{s+1}+\ldots+\beta_{n-s}\cdot\mathbf{e}_n.$$

Since $\mathbf{u}\in U$, we have $\operatorname{Cl}_U(\mathbf{u})=\mathbf{0}$. For the coset $Q=\operatorname{Cl}_U(\mathbf{v})$ this equality yields $Q=\beta_1\cdot\operatorname{Cl}_U(\mathbf{e}_{s+1})+\ldots+\beta_{n-s}\cdot\operatorname{Cl}_U(\mathbf{e}_n)$. Hence, we have

$$Q=\beta_1\cdot E_1+\ldots+\beta_{n-s}\cdot E_{n-s}.$$

This means that $E_1,\,\ldots,\,E_{n-s}$ is a finite spanning system in $V/U$. Therefore, $V/U$ is a finite-dimensional linear vector space. To determine its dimension we shall prove that the cosets (7.5) are linearly independent. Indeed, let's consider a linear combination of these cosets being equal to zero:

$$\gamma_1\cdot E_1+\ldots+\gamma_{n-s}\cdot E_{n-s}=\mathbf{0}. \tag{7.6}$$

Passing from cosets to their representative vectors, from (7.6) we derive

$$\begin{aligned}
\gamma_1\cdot\operatorname{Cl}_U(\mathbf{e}_{s+1})+\ldots+\gamma_{n-s}\cdot\operatorname{Cl}_U(\mathbf{e}_n)&=\\
=\operatorname{Cl}_U(\gamma_1\cdot\mathbf{e}_{s+1}+\ldots+\gamma_{n-s}\cdot\mathbf{e}_n)&=\operatorname{Cl}_U(\mathbf{0}).
\end{aligned}$$

Let's denote $\mathbf{u}=\gamma_1\cdot\mathbf{e}_{s+1}+\ldots+\gamma_{n-s}\cdot\mathbf{e}_n$. From the above equality for this vector we get $\operatorname{Cl}_U(\mathbf{u})=\operatorname{Cl}_U(\mathbf{0})$, which means $\mathbf{u}\in U$. Let's expand $\mathbf{u}$ in the basis of the subspace $U$: $\mathbf{u}=\alpha_1\cdot\mathbf{e}_1+\ldots+\alpha_s\cdot\mathbf{e}_s$. Then, equating the two expressions for the vector $\mathbf{u}$, we get the following equality:

$$-\alpha_1\cdot\mathbf{e}_1-\ldots-\alpha_s\cdot\mathbf{e}_s+\gamma_1\cdot\mathbf{e}_{s+1}+\ldots+\gamma_{n-s}\cdot\mathbf{e}_n=\mathbf{0}.$$

This is a linear combination of basis vectors of $V$ which is equal to zero. The basis vectors $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ are linearly independent. Hence, this linear combination is trivial and $\gamma_1=\ldots=\gamma_{n-s}=0$. This proves the triviality of the linear combination (7.6) and, therefore, the linear independence of the cosets (7.5). Thus, for the dimension of the factorspace this yields $\dim(V/U)=n-s$, which proves the equality (7.4). The theorem is proved. □
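
The construction used in the proof, completing a basis of $U$ to a basis of $V$ and counting the complementary vectors, can be sketched in code. This is only an illustration under stated assumptions: the space is $\mathbb{Q}^4$, the basis of $U$ is made up for the example, and the completion is done greedily with standard unit vectors, which is one possible way (not necessarily the one in theorem 4.8) to complete a basis. The helper names `rank` and `complete_to_basis` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def complete_to_basis(U, n):
    """Extend linearly independent rows U to a basis of Q^n with standard unit vectors."""
    basis = [list(u) for u in U]
    added = []
    for i in range(n):
        e = [1 if j == i else 0 for j in range(n)]
        # keep e only if it is not a linear combination of the vectors chosen so far
        if rank(basis + [e]) > rank(basis):
            basis.append(e)
            added.append(e)
    return basis, added

U = [[1, 2, 0, 0], [0, 1, 1, 0]]          # basis of U, s = 2 (illustrative)
basis, complement = complete_to_basis(U, 4)

# The cosets of the complementary vectors play the role of E_1, ..., E_{n-s} in (7.5),
# so their number is the dimension of the factorspace.
dim_quotient = len(complement)
print(complement, dim_quotient)
```

Here `dim_quotient` equals $n-s=4-2$, in agreement with (7.4).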

§ 8. Linear mappings. 

Definition 8.1. Let $V$ and $W$ be two linear vector spaces over a numeric field $\mathbb{K}$. A mapping $f\colon V\to W$ from the space $V$ to the space $W$ is called a linear mapping if the following two conditions are fulfilled:

(1) $f(\mathbf{v}_1+\mathbf{v}_2)=f(\mathbf{v}_1)+f(\mathbf{v}_2)$ for any two vectors $\mathbf{v}_1,\,\mathbf{v}_2\in V$;

(2) $f(\alpha\cdot\mathbf{v})=\alpha\cdot f(\mathbf{v})$ for any vector $\mathbf{v}\in V$ and for any number $\alpha\in\mathbb{K}$.

The relationship $f(\mathbf{0})=\mathbf{0}$ is one of the simplest and most immediate consequences of the above two properties (1) and (2) of linear mappings. Indeed, we have

$$f(\mathbf{0})=f(\mathbf{0}+(-1)\cdot\mathbf{0})=f(\mathbf{0})+(-1)\cdot f(\mathbf{0})=\mathbf{0}. \tag{8.1}$$

Theorem 8.1. Linear mappings possess the following three properties:

(1) the identical mapping $\operatorname{id}_V\colon V\to V$ of a linear vector space $V$ onto itself is a linear mapping;

(2) the composition of any two linear mappings $f\colon V\to W$ and $g\colon W\to U$ is a linear mapping $g\circ f\colon V\to U$;

(3) if a linear mapping $f\colon V\to W$ is bijective, then the inverse mapping $f^{-1}\colon W\to V$ is also a linear mapping.

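PROOF. The linearity of the identical mapping is obvious. Indeed, here is the verification of the conditions (1) and (2) from the definition 8.1 for $\operatorname{id}_V$:

$$\begin{aligned}
\operatorname{id}_V(\mathbf{v}_1+\mathbf{v}_2)&=\mathbf{v}_1+\mathbf{v}_2=\operatorname{id}_V(\mathbf{v}_1)+\operatorname{id}_V(\mathbf{v}_2),\\
\operatorname{id}_V(\alpha\cdot\mathbf{v})&=\alpha\cdot\mathbf{v}=\alpha\cdot\operatorname{id}_V(\mathbf{v}).
\end{aligned}$$

Let's prove the second proposition of the theorem 8.1. Consider the composition $g\circ f$ of two linear mappings $f$ and $g$. For this composition the conditions (1) and (2) from the definition 8.1 are verified as follows:

$$\begin{aligned}
g\circ f(\mathbf{v}_1+\mathbf{v}_2)&=g(f(\mathbf{v}_1+\mathbf{v}_2))=g(f(\mathbf{v}_1)+f(\mathbf{v}_2))=\\
&=g(f(\mathbf{v}_1))+g(f(\mathbf{v}_2))=g\circ f(\mathbf{v}_1)+g\circ f(\mathbf{v}_2),\\[4pt]
g\circ f(\alpha\cdot\mathbf{v})&=g(f(\alpha\cdot\mathbf{v}))=g(\alpha\cdot f(\mathbf{v}))=\alpha\cdot g(f(\mathbf{v}))=\\
&=\alpha\cdot g\circ f(\mathbf{v}).
\end{aligned}$$

Now let's prove the third proposition of the theorem 8.1. Suppose that $f\colon V\to W$ is a bijective linear mapping. Then it possesses a unique bilateral inverse mapping $f^{-1}\colon W\to V$ (see theorem 1.9). Let's denote

$$\begin{aligned}
\mathbf{z}_1&=f^{-1}(\mathbf{w}_1+\mathbf{w}_2)-f^{-1}(\mathbf{w}_1)-f^{-1}(\mathbf{w}_2),\\
\mathbf{z}_2&=f^{-1}(\alpha\cdot\mathbf{w})-\alpha\cdot f^{-1}(\mathbf{w}).
\end{aligned}$$

It is obvious that the linearity of the inverse mapping $f^{-1}$ is equivalent to the vanishing of $\mathbf{z}_1$ and $\mathbf{z}_2$. Let's apply $f$ to these vectors:

$$\begin{aligned}
f(\mathbf{z}_1)&=f(f^{-1}(\mathbf{w}_1+\mathbf{w}_2)-f^{-1}(\mathbf{w}_1)-f^{-1}(\mathbf{w}_2))=\\
&=f(f^{-1}(\mathbf{w}_1+\mathbf{w}_2))-f(f^{-1}(\mathbf{w}_1))-f(f^{-1}(\mathbf{w}_2))=\\
&=(\mathbf{w}_1+\mathbf{w}_2)-\mathbf{w}_1-\mathbf{w}_2=\mathbf{0},\\[4pt]
f(\mathbf{z}_2)&=f(f^{-1}(\alpha\cdot\mathbf{w})-\alpha\cdot f^{-1}(\mathbf{w}))=f(f^{-1}(\alpha\cdot\mathbf{w}))-\\
&\quad-\alpha\cdot f(f^{-1}(\mathbf{w}))=\alpha\cdot\mathbf{w}-\alpha\cdot\mathbf{w}=\mathbf{0}.
\end{aligned}$$

A bijective mapping is injective. Therefore, from the equalities $f(\mathbf{z}_1)=\mathbf{0}$ and $f(\mathbf{z}_2)=\mathbf{0}$ just derived and from the equality $f(\mathbf{0})=\mathbf{0}$ derived in (8.1) it follows that $\mathbf{z}_1=\mathbf{z}_2=\mathbf{0}$. The theorem is proved. □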

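Each linear mapping $f\colon V\to W$ is related with two subsets: the kernel $\operatorname{Ker} f\subset V$ and the image $\operatorname{Im} f\subset W$. The image $\operatorname{Im} f=f(V)$ of a linear mapping is defined in the same way as it was done for a general mapping in § 1:

$$\operatorname{Im} f=\{\mathbf{w}\in W:\ \exists\,\mathbf{v}\ ((\mathbf{v}\in V)\ \&\ (f(\mathbf{v})=\mathbf{w}))\}.$$

The kernel of a linear mapping $f\colon V\to W$ is the set of vectors in the space $V$ that map to zero under the action of $f$:

$$\operatorname{Ker} f=\{\mathbf{v}\in V:\ f(\mathbf{v})=\mathbf{0}\}.$$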

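Theorem 8.2. The kernel and the image of a linear mapping $f\colon V\to W$ are subspaces in $V$ and $W$ respectively.

PROOF. In order to prove this theorem we should check the conditions (1) and (2) from the definition 2.2 as applied to the subsets $\operatorname{Ker} f\subset V$ and $\operatorname{Im} f\subset W$.

Suppose that $\mathbf{v}_1,\,\mathbf{v}_2\in\operatorname{Ker} f$. Then $f(\mathbf{v}_1)=\mathbf{0}$ and $f(\mathbf{v}_2)=\mathbf{0}$. Suppose also that $\mathbf{v}\in\operatorname{Ker} f$. Then $f(\mathbf{v})=\mathbf{0}$. As a result we derive

$$\begin{aligned}
f(\mathbf{v}_1+\mathbf{v}_2)&=f(\mathbf{v}_1)+f(\mathbf{v}_2)=\mathbf{0}+\mathbf{0}=\mathbf{0},\\
f(\alpha\cdot\mathbf{v})&=\alpha\cdot f(\mathbf{v})=\alpha\cdot\mathbf{0}=\mathbf{0}.
\end{aligned}$$

Hence, $\mathbf{v}_1+\mathbf{v}_2\in\operatorname{Ker} f$ and $\alpha\cdot\mathbf{v}\in\operatorname{Ker} f$. This proves the proposition of the theorem concerning the kernel $\operatorname{Ker} f$.

Let $\mathbf{w}_1,\,\mathbf{w}_2,\,\mathbf{w}\in\operatorname{Im} f$. Then there are three vectors $\mathbf{v}_1,\,\mathbf{v}_2,\,\mathbf{v}$ in $V$ such that $f(\mathbf{v}_1)=\mathbf{w}_1$, $f(\mathbf{v}_2)=\mathbf{w}_2$, and $f(\mathbf{v})=\mathbf{w}$. Hence, we have

$$\begin{aligned}
\mathbf{w}_1+\mathbf{w}_2&=f(\mathbf{v}_1)+f(\mathbf{v}_2)=f(\mathbf{v}_1+\mathbf{v}_2),\\
\alpha\cdot\mathbf{w}&=\alpha\cdot f(\mathbf{v})=f(\alpha\cdot\mathbf{v}).
\end{aligned}$$

This means that $\mathbf{w}_1+\mathbf{w}_2\in\operatorname{Im} f$ and $\alpha\cdot\mathbf{w}\in\operatorname{Im} f$. The theorem is proved. □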

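Remember that, according to the theorem 1.2, a linear mapping $f\colon V\to W$ is surjective if and only if $\operatorname{Im} f=W$. There is a similar proposition for $\operatorname{Ker} f$.

Theorem 8.3. A linear mapping $f\colon V\to W$ is injective if and only if its kernel is zero, i.e. $\operatorname{Ker} f=\{\mathbf{0}\}$.

PROOF. Let $f$ be injective and let $\mathbf{v}\in\operatorname{Ker} f$. Then $f(\mathbf{0})=\mathbf{0}$ and $f(\mathbf{v})=\mathbf{0}$. But if $\mathbf{v}\neq\mathbf{0}$, then due to the injectivity of $f$ it would be $f(\mathbf{v})\neq f(\mathbf{0})$. Hence, $\mathbf{v}=\mathbf{0}$. This means that the kernel of $f$ consists of only one element: $\operatorname{Ker} f=\{\mathbf{0}\}$.

Now conversely, suppose that $\operatorname{Ker} f=\{\mathbf{0}\}$. Let's consider two different vectors $\mathbf{v}_1\neq\mathbf{v}_2$ in $V$. Then $\mathbf{v}_1-\mathbf{v}_2\neq\mathbf{0}$ and $\mathbf{v}_1-\mathbf{v}_2\notin\operatorname{Ker} f$. Therefore, $f(\mathbf{v}_1-\mathbf{v}_2)\neq\mathbf{0}$. Applying the linearity of $f$, from this inequality we derive $f(\mathbf{v}_1)-f(\mathbf{v}_2)\neq\mathbf{0}$, i.e. $f(\mathbf{v}_1)\neq f(\mathbf{v}_2)$. Hence, $f$ is an injective mapping. The theorem is proved. □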
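
The criterion of theorem 8.3 can be tried out on mappings between coordinate spaces. The sketch below is an illustration only: the two mappings are given by made-up matrices acting on columns of $\mathbb{Q}^n$, and the helper names `kernel_basis` and `apply` are hypothetical. The kernel is found by bringing the matrix to reduced row echelon form.

```python
from fractions import Fraction

def kernel_basis(F):
    """Basis of {x : F x = 0} for a matrix F (list of rows), via reduced row echelon form."""
    m = [[Fraction(x) for x in row] for row in F]
    n = len(m[0])
    pivots = []
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [a / lead for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        # one kernel vector per free variable: set it to 1, solve for the pivot variables
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row, pc in zip(m, pivots):
            x[pc] = -row[free]
        basis.append(x)
    return basis

def apply(F, x):
    """Image of the column x under the mapping with matrix F."""
    return [sum(a * b for a, b in zip(row, x)) for row in F]

F_inj = [[1, 0], [0, 1], [1, 1]]   # (x, y) -> (x, y, x + y): kernel is zero
G = [[1, 1, 0], [0, 1, 1]]         # (x, y, z) -> (x + y, y + z): kernel is a line

k = kernel_basis(G)[0]             # a nonzero kernel vector of G
a = [1, 2, 3]
b = [ai + ki for ai, ki in zip(a, k)]   # a different vector with the same image
print(kernel_basis(F_inj), k, apply(G, a), apply(G, b))
```

For `G` the two distinct vectors `a` and `b` differ by a kernel vector and therefore have the same image, which is exactly how a nonzero kernel breaks injectivity.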

The following theorem is known as the theorem on the linear independence of preimages. Here is its statement.

Theorem 8.4. Let $f\colon V\to W$ be a linear mapping and let $\mathbf{v}_1,\,\ldots,\,\mathbf{v}_s$ be some vectors of a linear vector space $V$ such that their images $f(\mathbf{v}_1),\,\ldots,\,f(\mathbf{v}_s)$ in $W$ are linearly independent. Then the vectors $\mathbf{v}_1,\,\ldots,\,\mathbf{v}_s$ themselves are also linearly independent.

Proof. In order to prove the theorem let's consider a linear combination of the vectors $\mathbf{v}_1,\,\ldots,\,\mathbf{v}_s$ being equal to zero:

$$\alpha_1\cdot\mathbf{v}_1+\ldots+\alpha_s\cdot\mathbf{v}_s=\mathbf{0}.$$

Applying $f$ to both sides of this equality and using the fact that $f$ is a linear mapping, we obtain a quite similar equality for the images:

$$\alpha_1\cdot f(\mathbf{v}_1)+\ldots+\alpha_s\cdot f(\mathbf{v}_s)=\mathbf{0}.$$

However, these images $f(\mathbf{v}_1),\,\ldots,\,f(\mathbf{v}_s)$ are linearly independent. Hence, all coefficients in the above linear combination are equal to zero: $\alpha_1=\ldots=\alpha_s=0$. Then the initial linear combination is also necessarily trivial. This proves that the vectors $\mathbf{v}_1,\,\ldots,\,\mathbf{v}_s$ are linearly independent. □

A linear vector space is a set. But it is not simply a set: it is a structured set. It is equipped with algebraic operations satisfying the axioms (1)-(8). Linear mappings are exactly those mappings that are concordant with the structures of a linear vector space in the spaces they act from and to. In algebra such mappings concordant with algebraic structures are called morphisms. So, in algebraic terminology, linear mappings are morphisms of linear vector spaces.

Definition 8.2. Two linear vector spaces $V$ and $W$ are called isomorphic if there is a bijective linear mapping $f\colon V\to W$ binding them.

The first example of an isomorphism of linear vector spaces is the mapping $\psi\colon V\to\mathbb{K}^n$ in (5.4). Because of the existence of such a mapping we can formulate the following theorem.

Theorem 8.5. Any $n$-dimensional linear vector space $V$ is isomorphic to the arithmetic linear vector space $\mathbb{K}^n$.

Isomorphic linear vector spaces have many common features. Often they can be treated as indistinguishable. In particular, we have the following fact.

Theorem 8.6. If a linear vector space $V$ is isomorphic to a finite-dimensional vector space $W$, then $V$ is also finite-dimensional and the dimensions of these two spaces coincide: $\dim V=\dim W$.

Proof. Let $f\colon V\to W$ be an isomorphism of spaces $V$ and $W$. Assume for the sake of certainty that $\dim W=n$ and choose a basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_n$ in $W$. By means of the inverse mapping $f^{-1}\colon W\to V$ we define the vectors $\mathbf{e}_i=f^{-1}(\mathbf{h}_i)$, $i=1,\,\ldots,\,n$. Let $\mathbf{v}$ be an arbitrary vector of $V$. Let's map it with the use of $f$ into the space $W$ and then expand in the basis:

$$f(\mathbf{v})=\alpha_1\cdot\mathbf{h}_1+\ldots+\alpha_n\cdot\mathbf{h}_n.$$

Applying the inverse mapping $f^{-1}$ to both sides of this equality, due to the linearity of $f^{-1}$ we get the expansion

$$\mathbf{v}=\alpha_1\cdot\mathbf{e}_1+\ldots+\alpha_n\cdot\mathbf{e}_n.$$

From this expansion we derive that $\{\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n\}$ is a finite spanning system in $V$. The finite dimensionality of $V$ is proved. The linear independence of $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ follows from the theorem 8.4 on the linear independence of preimages, since their images $f(\mathbf{e}_i)=\mathbf{h}_i$ are linearly independent. Hence, $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ is a basis in $V$ and $\dim V=n=\dim W$. The theorem is proved. □

§ 9. The matrix of a linear mapping. 

Let $f\colon V\to W$ be a linear mapping from an $n$-dimensional vector space $V$ to an $m$-dimensional vector space $W$. Let's choose a basis $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ in $V$ and a basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$ in $W$. Then consider the images of the basis vectors $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ in $W$ and expand them in the basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$:

$$\begin{aligned}
f(\mathbf{e}_1)&=F^1_1\cdot\mathbf{h}_1+\ldots+F^m_1\cdot\mathbf{h}_m,\\
&\ \ \vdots\\
f(\mathbf{e}_n)&=F^1_n\cdot\mathbf{h}_1+\ldots+F^m_n\cdot\mathbf{h}_m.
\end{aligned} \tag{9.1}$$

In total, (9.1) contains $n$ expansions that define $n\,m$ numbers $F^i_j$. These numbers are arranged into a rectangular $m\times n$ matrix which is called the matrix of the linear mapping $f$ in the pair of bases $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ and $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$:

$$F=\begin{pmatrix}F^1_1&\ldots&F^1_n\\ \vdots&\ddots&\vdots\\ F^m_1&\ldots&F^m_n\end{pmatrix}. \tag{9.2}$$

When placing the element $F^i_j$ into the matrix (9.2), the upper index determines the row number, while the lower index determines the column number. In other words, the matrix $F$ is composed of the column vectors formed by the coordinates of the vectors $f(\mathbf{e}_1),\,\ldots,\,f(\mathbf{e}_n)$ in the basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$. The expansions (9.1), which determine the components of this matrix, are convenient to write as follows:

$$f(\mathbf{e}_j)=\sum_{i=1}^{m}F^i_j\cdot\mathbf{h}_i. \tag{9.3}$$

Let $\mathbf{x}$ be an arbitrary vector of $V$ and let $\mathbf{y}=f(\mathbf{x})$ be its image under the mapping $f$. If we expand the vector $\mathbf{x}$ in the basis: $\mathbf{x}=x^1\cdot\mathbf{e}_1+\ldots+x^n\cdot\mathbf{e}_n$, then, taking into account (9.3), for the vector $\mathbf{y}$ we get

$$\mathbf{y}=f(\mathbf{x})=\sum_{j=1}^{n}x^j\cdot f(\mathbf{e}_j)=\sum_{j=1}^{n}x^j\cdot\Bigl(\,\sum_{i=1}^{m}F^i_j\cdot\mathbf{h}_i\Bigr).$$

Changing the order of summations in the above expression, we get the expansion of the vector $\mathbf{y}$ in the basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$:

$$\mathbf{y}=f(\mathbf{x})=\sum_{i=1}^{m}\Bigl(\,\sum_{j=1}^{n}F^i_j\,x^j\Bigr)\cdot\mathbf{h}_i.$$

Due to the uniqueness of such an expansion, for the coordinates of the vector $\mathbf{y}$ in the basis $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$ we get the following formula:

$$y^i=\sum_{j=1}^{n}F^i_j\,x^j. \tag{9.4}$$

This formula (9.4) is the basic application of the matrix of a linear mapping. It is used for calculating the coordinates of the vector $f(\mathbf{x})$ through the coordinates of $\mathbf{x}$. In matrix form this formula is written as

$$\begin{pmatrix}y^1\\ \vdots\\ y^m\end{pmatrix}=\begin{pmatrix}F^1_1&\ldots&F^1_n\\ \vdots&\ddots&\vdots\\ F^m_1&\ldots&F^m_n\end{pmatrix}\cdot\begin{pmatrix}x^1\\ \vdots\\ x^n\end{pmatrix}. \tag{9.5}$$

Remember that when composing a column vector of the coordinates of a vector $\mathbf{x}$, we agreed to understand this procedure as a linear mapping $\psi\colon V\to\mathbb{K}^n$ (see formulas (5.4) and the theorem 8.5). Denote by $\tilde\psi\colon W\to\mathbb{K}^m$ the analogous mapping for a vector $\mathbf{y}$ in $W$. Then the matrix relationship (9.5) can be treated as a mapping $F\colon\mathbb{K}^n\to\mathbb{K}^m$. These three mappings $\psi$, $\tilde\psi$, $F$ and the initial mapping $f$ can be written in a diagram:

$$\begin{array}{ccc}
V & \xrightarrow{\ \ f\ \ } & W\\
{\scriptstyle\psi}\big\downarrow & & \big\downarrow{\scriptstyle\tilde\psi}\\
\mathbb{K}^n & \xrightarrow{\ \ F\ \ } & \mathbb{K}^m
\end{array} \tag{9.6}$$

Such diagrams are called commutative diagrams if the compositions of mappings "when passing along arrows" from any node to any other node do not depend on a particular path connecting these two nodes. When applied to the diagram (9.6), the commutativity means $\tilde\psi\circ f=F\circ\psi$. Due to the bijectivity of the linear mappings $\psi$ and $\tilde\psi$ the condition of commutativity of the diagram (9.6) can be written as

$$F=\tilde\psi\circ f\circ\psi^{-1},\qquad f=\tilde\psi^{-1}\circ F\circ\psi. \tag{9.7}$$

The reader can easily check that the relationships (9.7) are fulfilled due to the way the matrix $F$ is constructed. Hence, the diagram (9.6) is commutative.
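
The coordinate formula (9.4) can be sketched in a few lines of code. The example mapping is an assumption made here for illustration: differentiation of polynomials of degree at most 2, taken in the bases $1,\,t,\,t^2$ of $V$ and $1,\,t$ of $W$. The function names `matrix_of_mapping` and `apply` are likewise hypothetical.

```python
def matrix_of_mapping(images):
    """Build F from the coordinate columns of f(e_1), ..., f(e_n) in the basis h_1, ..., h_m."""
    m = len(images[0])
    n = len(images)
    # upper index i = row, lower index j = column: F[i][j] holds F^i_j
    return [[images[j][i] for j in range(n)] for i in range(m)]

def apply(F, x):
    """Formula (9.4): y^i = sum over j of F^i_j * x^j."""
    return [sum(F[i][j] * x[j] for j in range(len(x))) for i in range(len(F))]

# Coordinates in the basis (1, t) of the derivatives of 1, t, t^2:
# d/dt 1 = 0,  d/dt t = 1,  d/dt t^2 = 2t.
images = [[0, 0], [1, 0], [0, 2]]
F = matrix_of_mapping(images)

# x = 3 + 5t + 4t^2 has coordinates (3, 5, 4); its derivative is 5 + 8t.
y = apply(F, [3, 5, 4])
print(F, y)
```

The columns of `F` are the coordinate columns of the images of the basis vectors, as described after (9.2), and `y` holds the coordinates of the derivative $5+8t$.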

Now let's look at the relationships (9.7) from a slightly different point of view. Let $V$ and $W$ be two spaces of the dimensions $n$ and $m$ respectively. Suppose that we have an arbitrary $m\times n$ matrix $F$. Then the relationship (9.5) determines a linear mapping $F\colon\mathbb{K}^n\to\mathbb{K}^m$. Choosing bases $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ and $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$ in $V$ and $W$ we can use the second relationship (9.7) in order to define the linear mapping $f\colon V\to W$. The matrix of this mapping in the bases $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ and $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$ coincides with $F$ exactly. Thus, we have proved the following theorem.

Theorem 9.1. Any rectangular $m\times n$ matrix $F$ can be constructed as a matrix of a linear mapping $f\colon V\to W$ from an $n$-dimensional vector space $V$ to an $m$-dimensional vector space $W$ in some pair of bases in these spaces.

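A more straightforward way of proving the theorem 9.1 than the one we considered above can be based on the following theorem.

Theorem 9.2. For any basis $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ in an $n$-dimensional vector space $V$ and for any set of $n$ vectors $\mathbf{w}_1,\,\ldots,\,\mathbf{w}_n$ in another vector space $W$ there is a linear mapping $f\colon V\to W$ such that $f(\mathbf{e}_i)=\mathbf{w}_i$ for $i=1,\,\ldots,\,n$.

Proof. Once the basis $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ in $V$ is chosen, this defines the mapping $\psi\colon V\to\mathbb{K}^n$ (see (5.4)). In order to construct the required mapping $f$ we define a mapping $\varphi\colon\mathbb{K}^n\to W$ by the following relationship:

$$\varphi\colon\ \begin{pmatrix}x^1\\ \vdots\\ x^n\end{pmatrix}\mapsto x^1\cdot\mathbf{w}_1+\ldots+x^n\cdot\mathbf{w}_n.$$

Now it is easy to verify that the required mapping is the composition $f=\varphi\circ\psi$. □

Let's return to the initial situation. Suppose that we have a mapping $f\colon V\to W$ that determines a matrix $F$ upon choosing two bases $\mathbf{e}_1,\,\ldots,\,\mathbf{e}_n$ and $\mathbf{h}_1,\,\ldots,\,\mathbf{h}_m$ in $V$ and $W$ respectively. The matrix $F$ essentially depends on the choice of bases. In order to describe this dependence we consider four bases: two bases in $V$ and two other bases in $W$. Suppose that $S$ and $P$ are the direct transition matrices for those pairs of bases. Their components are defined as follows:

$$\tilde{\mathbf{e}}_k=\sum_{j=1}^{n}S^j_k\cdot\mathbf{e}_j,\qquad \tilde{\mathbf{h}}_r=\sum_{i=1}^{m}P^i_r\cdot\mathbf{h}_i.$$

The inverse transition matrices $T=S^{-1}$ and $Q=P^{-1}$ are defined similarly:

$$\mathbf{e}_j=\sum_{k=1}^{n}T^k_j\cdot\tilde{\mathbf{e}}_k,\qquad \mathbf{h}_i=\sum_{r=1}^{m}Q^r_i\cdot\tilde{\mathbf{h}}_r.$$

We use these relationships and the above relationships (9.3) in order to carry out the following calculations for the vector $f(\tilde{\mathbf{e}}_k)$:

$$\begin{aligned}
f(\tilde{\mathbf{e}}_k)&=\sum_{j=1}^{n}S^j_k\cdot f(\mathbf{e}_j)=\sum_{j=1}^{n}S^j_k\cdot\Bigl(\,\sum_{i=1}^{m}F^i_j\cdot\mathbf{h}_i\Bigr)=\\
&=\sum_{j=1}^{n}S^j_k\cdot\Bigl(\,\sum_{i=1}^{m}F^i_j\cdot\Bigl(\,\sum_{r=1}^{m}Q^r_i\cdot\tilde{\mathbf{h}}_r\Bigr)\Bigr).
\end{aligned}$$

Upon changing the order of summations this result is written as

$$f(\tilde{\mathbf{e}}_k)=\sum_{r=1}^{m}\Bigl(\,\sum_{i=1}^{m}\sum_{j=1}^{n}Q^r_i\,F^i_j\,S^j_k\Bigr)\cdot\tilde{\mathbf{h}}_r.$$

The double sums in round brackets are the coefficients of the expansion of the vector $f(\tilde{\mathbf{e}}_k)$ in the basis $\tilde{\mathbf{h}}_1,\,\ldots,\,\tilde{\mathbf{h}}_m$. They determine the matrix of the linear mapping $f$ in the wavy bases $\tilde{\mathbf{e}}_1,\,\ldots,\,\tilde{\mathbf{e}}_n$ and $\tilde{\mathbf{h}}_1,\,\ldots,\,\tilde{\mathbf{h}}_m$:

$$\tilde F^r_k=\sum_{i=1}^{m}\sum_{j=1}^{n}Q^r_i\,F^i_j\,S^j_k. \tag{9.8}$$

In a similar way one can derive the converse relationship expressing $F$ through $\tilde F$:

$$F^i_j=\sum_{r=1}^{m}\sum_{k=1}^{n}P^i_r\,\tilde F^r_k\,T^k_j. \tag{9.9}$$

The relationships (9.8) and (9.9) are called the transformation formulas for the matrix of a linear mapping under a change of bases. They can be written as

$$\tilde F=P^{-1}\,F\,S,\qquad F=P\,\tilde F\,S^{-1}. \tag{9.10}$$

This is the matrix form of the relationships (9.8) and (9.9).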
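
A small numeric check of (9.10) may help to see how the pieces fit together. The sketch below rests on assumptions made here for illustration: all matrices are made-up $2\times 2$ examples, the helper names `matmul` and `inverse2` are hypothetical, and the coordinates of a vector are taken to transform by the standard rule "old coordinates $=S\cdot$ new coordinates", which follows from the definition of $S$ above.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse2(M):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

F = [[1, 2], [3, 4]]   # matrix of f in the initial bases (illustrative)
S = [[1, 1], [0, 1]]   # direct transition matrix in V
P = [[2, 0], [1, 1]]   # direct transition matrix in W

F_new = matmul(matmul(inverse2(P), F), S)        # first formula (9.10)
F_back = matmul(matmul(P, F_new), inverse2(S))   # second formula (9.10)

# Consistency with the meaning of the matrix: take coordinates in the new basis of V,
# pass to the old basis, apply F, and return to the new basis of W.
x_new = [[5], [7]]
x_old = matmul(S, x_new)
y_old = matmul(F, x_old)
y_new = matmul(inverse2(P), y_old)
print(F_new, y_new, matmul(F_new, x_new))
```

Both routes give the same column `y_new`, so the matrix `F_new` does act on the new coordinates the way (9.5) prescribes.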

The transformation formulas like (9.10) lead us to the broad class of problems of "bringing to a canonic form". In our particular case a change of bases in the spaces $V$ and $W$ changes the matrix of the linear mapping $f\colon V\to W$. The problem of bringing to a canonic form in this case consists in finding the optimal choice of bases, where the matrix $F$ has the simplest (canonic) form. The following theorem solving this particular problem is known as the theorem on bringing to the almost diagonal form.

Theorem 9.3. Let $f\colon V\to W$ be some nonzero linear mapping from an $n$-dimensional vector space $V$ to an $m$-dimensional vector space $W$. Then there is a choice of bases in $V$ and $W$ such that the matrix $F$ of this mapping has the following almost diagonal form:

$$F=\left(\begin{array}{ccc|ccc}
1 & & & 0 & \cdots & 0\\
 & \ddots & & \vdots & & \vdots\\
 & & 1 & 0 & \cdots & 0\\
\hline
0 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & & \vdots & \vdots & & \vdots\\
0 & \cdots & 0 & 0 & \cdots & 0
\end{array}\right). \tag{9.11}$$

Here the upper left block contains $s$ units on the diagonal, and all other elements of the matrix are equal to zero.

PROOF. The purely zero mapping : V — > W maps each vector of the space V 
to zero vector in W. The matrix of such mapping consists of zeros only. There is 
no need to formulate the problem of bringing it to a canonic form. 

Let / : V — ► W be a nonzero linear mapping. The integer number s = dim(Im /) 
is called the rank of the mapping /. The rank of a nonzero mapping is not equal to 
zero. We begin constructing a canonic base in W by choosing a base hi, ... , h s in 
the image space Im /. For each basis vector h^ G Im / there is a vector <G V such 
that /(ej) = hi, i = 1, . . . , s. These vectors ei, ... , e s are linearly independent 
due to the theorem 8.4. Let r = dim(Ker J). We choose a basis in Ker/ and 
denote the basis vectors by e a +i, . . . , e s+r . Then we consider the vectors 

ei, . . . , e s , e g+ i, . . . , e s+r (9-12) 

and prove that they form a basis in V. For this purpose we use the theorem 4.6. 

Let's begin with checking the condition (1) in theorem 4.6 for the vectors
(9.12). In order to prove the linear independence of these vectors we consider a
linear combination of them that is equal to zero:

$$\alpha_1 \cdot \mathbf{e}_1 + \ldots + \alpha_s \cdot \mathbf{e}_s + \alpha_{s+1} \cdot \mathbf{e}_{s+1} + \ldots + \alpha_{s+r} \cdot \mathbf{e}_{s+r} = \mathbf{0}. \tag{9.13}$$

Let's apply the mapping $f$ to both sides of the equality (9.13) and take into
account that $f(\mathbf{e}_i) = \mathbf{h}_i$ for $i = 1, \ldots, s$. The other vectors
belong to the kernel of the mapping $f$, therefore $f(\mathbf{e}_{s+i}) = \mathbf{0}$
for $i = 1, \ldots, r$. Then from (9.13) we derive

$$\alpha_1 \cdot \mathbf{h}_1 + \ldots + \alpha_s \cdot \mathbf{h}_s = \mathbf{0}.$$

The vectors $\mathbf{h}_1, \ldots, \mathbf{h}_s$ form a basis in $\operatorname{Im} f$.
They are linearly independent. Hence, $\alpha_1 = \ldots = \alpha_s = 0$. Taking this
fact into account, we reduce (9.13) to

$$\alpha_{s+1} \cdot \mathbf{e}_{s+1} + \ldots + \alpha_{s+r} \cdot \mathbf{e}_{s+r} = \mathbf{0}.$$

The vectors $\mathbf{e}_{s+1}, \ldots, \mathbf{e}_{s+r}$ form a basis in
$\operatorname{Ker} f$. They are linearly independent, therefore
$\alpha_{s+1} = \ldots = \alpha_{s+r} = 0$. As a result we have proved that all
coefficients of the linear combination (9.13) are necessarily zero. Hence, the
vectors (9.12) are linearly independent.

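Now let's check the second condition of theorem 4.6 for the vectors (9.12).
Assume that $\mathbf{v}$ is an arbitrary vector in $V$. Then $f(\mathbf{v})$ belongs to
$\operatorname{Im} f$. Let's expand $f(\mathbf{v})$ in the basis
$\mathbf{h}_1, \ldots, \mathbf{h}_s$:

$$f(\mathbf{v}) = \beta_1 \cdot \mathbf{h}_1 + \ldots + \beta_s \cdot \mathbf{h}_s. \tag{9.14}$$

Remember that $f(\mathbf{e}_i) = \mathbf{h}_i$ for $i = 1, \ldots, s$. Then from (9.14)
we derive

$$
\begin{aligned}
\mathbf{0} &= f(\mathbf{v}) - \beta_1 \cdot f(\mathbf{e}_1) - \ldots - \beta_s \cdot f(\mathbf{e}_s) = \\
&= f(\mathbf{v} - \beta_1 \cdot \mathbf{e}_1 - \ldots - \beta_s \cdot \mathbf{e}_s).
\end{aligned}
\tag{9.15}
$$

Let's denote $\tilde{\mathbf{v}} = \mathbf{v} - \beta_1 \cdot \mathbf{e}_1 - \ldots - \beta_s \cdot \mathbf{e}_s$.
From (9.15) we derive $f(\tilde{\mathbf{v}}) = \mathbf{0}$ for this vector
$\tilde{\mathbf{v}}$. Hence, $\tilde{\mathbf{v}} \in \operatorname{Ker} f$. Let's expand
$\tilde{\mathbf{v}}$ in the basis of $\operatorname{Ker} f$:

$$\tilde{\mathbf{v}} = \beta_{s+1} \cdot \mathbf{e}_{s+1} + \ldots + \beta_{s+r} \cdot \mathbf{e}_{s+r}.$$

From the formula $\tilde{\mathbf{v}} = \mathbf{v} - \beta_1 \cdot \mathbf{e}_1 - \ldots - \beta_s \cdot \mathbf{e}_s$
and the above expansion we get

$$\mathbf{v} = \beta_1 \cdot \mathbf{e}_1 + \ldots + \beta_s \cdot \mathbf{e}_s + \beta_{s+1} \cdot \mathbf{e}_{s+1} + \ldots + \beta_{s+r} \cdot \mathbf{e}_{s+r}.$$

This means that the vectors (9.12) form a spanning system in $V$. The condition
(2) of theorem 4.6 for them is also fulfilled. Thus, the vectors (9.12) form a
basis in $V$. This yields the equality

$$\dim V = s + r. \tag{9.16}$$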

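In order to complete the proof of the theorem we need to complete the basis
$\mathbf{h}_1, \ldots, \mathbf{h}_s$ of $\operatorname{Im} f$ up to a basis
$\mathbf{h}_1, \ldots, \mathbf{h}_s, \mathbf{h}_{s+1}, \ldots, \mathbf{h}_m$ in the
space $W$. For the vector $f(\mathbf{e}_j)$ with $j = 1, \ldots, s$ we have the expansion

$$f(\mathbf{e}_j) = \mathbf{h}_j = \sum_{i=1}^{s} \delta^i_j \cdot \mathbf{h}_i + \sum_{i=s+1}^{m} 0 \cdot \mathbf{h}_i.$$

If $j = s + 1, \ldots, s + r$, the expansion for $f(\mathbf{e}_j)$ is purely zero:

$$f(\mathbf{e}_j) = \mathbf{0} = \sum_{i=1}^{s} 0 \cdot \mathbf{h}_i + \sum_{i=s+1}^{m} 0 \cdot \mathbf{h}_i.$$

Due to these expansions the matrix of the mapping $f$ in the bases that we have
constructed above has the required almost diagonal form (9.11). □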

In proving this theorem we have simultaneously proved the next one.

Theorem 9.4. Let $f\colon V \to W$ be a linear mapping from an $n$-dimensional space
$V$ to an arbitrary linear vector space $W$. Then

$$\dim(\operatorname{Ker} f) + \dim(\operatorname{Im} f) = \dim V. \tag{9.17}$$

Theorem 9.4 is known as the theorem on the sum of dimensions of the
kernel and the image of a linear mapping. The proposition of the theorem in the
form of the relationship (9.17) immediately follows from (9.16).
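The relationship (9.17) can be checked numerically for a concrete matrix. The following is only a minimal sketch under stated assumptions: the mapping is taken to be $\mathbf{x} \mapsto F\,\mathbf{x}$ for a hypothetical $3 \times 4$ matrix $F$ chosen for illustration, and the helper names `rref` and `kernel_basis` are not part of the text. The rank is counted as the number of pivot columns of the reduced row echelon form, and a basis of the kernel is read off from the free columns.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot columns), exact arithmetic."""
    a = [[Fraction(x) for x in row] for row in rows]
    m, n = len(a), len(a[0])
    pivots = []
    r = 0
    for c in range(n):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, m) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        a[r] = [x / a[r][c] for x in a[r]]
        for i in range(m):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return a, pivots

def kernel_basis(rows):
    """Basis of Ker f for the mapping x -> F x, one vector per free column."""
    a, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for fc in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)
        for row_index, pc in enumerate(pivots):
            v[pc] = -a[row_index][fc]
        basis.append(v)
    return basis

# Hypothetical matrix of a mapping f : K^4 -> K^3 (third row = first + second).
F = [[1, 2, 0, 1],
     [0, 1, 1, 1],
     [1, 3, 1, 2]]

_, pivots = rref(F)
s = len(pivots)          # s = dim(Im f), the rank
ker = kernel_basis(F)
r = len(ker)             # r = dim(Ker f)
print(s, r, s + r)       # s + r should equal the number of columns, dim V
```

For this particular matrix the rank and the dimension of the kernel are both 2, and their sum is the dimension 4 of the domain.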

§ 10. Algebraic operations with mappings.
The space of homomorphisms Hom(V, W).

Definition 10.1. Let $V$ and $W$ be two linear vector spaces and let
$f\colon V \to W$ and $g\colon V \to W$ be two linear mappings from $V$ to $W$. The
linear mapping $h\colon V \to W$ defined by the relationship
$h(\mathbf{v}) = f(\mathbf{v}) + g(\mathbf{v})$, where $\mathbf{v}$ is an arbitrary
vector of $V$, is called the sum of the mappings $f$ and $g$.

Definition 10.2. Let $V$ and $W$ be two linear vector spaces over a numeric
field $\mathbb{K}$ and let $f\colon V \to W$ be a linear mapping from $V$ to $W$. The
linear mapping $h\colon V \to W$ defined by the relationship
$h(\mathbf{v}) = \alpha \cdot f(\mathbf{v})$, where $\mathbf{v}$ is an arbitrary
vector of $V$, is called the product of the number $\alpha \in \mathbb{K}$ and the
mapping $f$.

The algebraic operations introduced by the definitions 10.1 and 10.2 are called
pointwise addition and pointwise multiplication by a number. Indeed, they are
calculated «pointwise» by adding the values of the initial mappings and by
multiplying them by a number for each specific argument $\mathbf{v} \in V$. These
operations are denoted by the same signs as the corresponding operations with
vectors: $h = f + g$ and $h = \alpha \cdot f$. The notation $(f + g)(\mathbf{v})$ is
understood as the sum of mappings applied to the vector $\mathbf{v}$. The other
notation $f(\mathbf{v}) + g(\mathbf{v})$ denotes the sum of the results of applying
$f$ and $g$ to $\mathbf{v}$ separately. Though the results of these calculations do
coincide, their meanings are different. In a similar way one should distinguish the
meanings of the left and right sides of the following equality:

$$(\alpha \cdot f)(\mathbf{v}) = \alpha \cdot f(\mathbf{v}).$$

Let's denote by Map(V, W) the set of all mappings from the space $V$ to the
space $W$. Sometimes this set is denoted by $W^V$.
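The pointwise operations can be sketched in code by treating mappings as functions. This is only an illustration under stated assumptions: vectors are modelled as tuples of numbers, and the names `add`, `scale`, `f` and `g` are hypothetical, chosen for this example.

```python
def add(f, g):
    """Pointwise sum: (f + g)(v) = f(v) + g(v)."""
    return lambda v: tuple(a + b for a, b in zip(f(v), g(v)))

def scale(alpha, f):
    """Pointwise product by a number: (alpha . f)(v) = alpha . f(v)."""
    return lambda v: tuple(alpha * a for a in f(v))

# Two hypothetical mappings from K^2 to K^2, with vectors written as tuples.
f = lambda v: (v[0] + v[1], 2 * v[1])
g = lambda v: (3 * v[0], v[0] - v[1])

h = add(f, scale(2, g))      # the mapping f + 2 . g
v = (1, 4)
print(h(v))                  # value of (f + 2 . g) at v
print(add(f, g)(v) == add(g, f)(v))
```

The last line compares $f + g$ and $g + f$ at one vector only; equality of the mappings themselves means equality of their values at every vector of $V$.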

Theorem 10.1. Let $V$ and $W$ be two linear spaces over a numeric field
$\mathbb{K}$. Then the set of mappings Map(V, W) equipped with the operations of
pointwise addition and pointwise multiplication by numbers fits the definition of a
linear vector space over the numeric field $\mathbb{K}$.

PROOF. Let's verify the axioms of a linear vector space for the set of mappings
Map(V, W). In the case of the first axiom we should verify the coincidence of the
mappings $f + g$ and $g + f$. Remember that the coincidence of two mappings is
equivalent to the coincidence of their values when applied to an arbitrary vector
$\mathbf{v} \in V$. The following calculations establish the latter coincidence:

$$(f + g)(\mathbf{v}) = f(\mathbf{v}) + g(\mathbf{v}) = g(\mathbf{v}) + f(\mathbf{v}) = (g + f)(\mathbf{v}).$$

As we see in the above calculations, the equality $f + g = g + f$ follows from the
commutativity axiom for the addition of vectors in $W$ due to the pointwise nature of
the addition of mappings. The same arguments are applicable when verifying the
axioms (2), (5), and (6) for the algebraic operations with mappings:

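$$
\begin{aligned}
((f + g) + h)(\mathbf{v}) &= (f + g)(\mathbf{v}) + h(\mathbf{v}) = (f(\mathbf{v}) + g(\mathbf{v})) + h(\mathbf{v}) = \\
&= f(\mathbf{v}) + (g(\mathbf{v}) + h(\mathbf{v})) = f(\mathbf{v}) + (g + h)(\mathbf{v}) = (f + (g + h))(\mathbf{v}),
\end{aligned}
$$

$$
\begin{aligned}
(\alpha \cdot (f + g))(\mathbf{v}) &= \alpha \cdot (f + g)(\mathbf{v}) = \alpha \cdot (f(\mathbf{v}) + g(\mathbf{v})) = \\
&= \alpha \cdot f(\mathbf{v}) + \alpha \cdot g(\mathbf{v}) = (\alpha \cdot f)(\mathbf{v}) + (\alpha \cdot g)(\mathbf{v}) = (\alpha \cdot f + \alpha \cdot g)(\mathbf{v}),
\end{aligned}
$$

$$
\begin{aligned}
((\alpha + \beta) \cdot f)(\mathbf{v}) &= (\alpha + \beta) \cdot f(\mathbf{v}) = \alpha \cdot f(\mathbf{v}) + \beta \cdot f(\mathbf{v}) = \\
&= (\alpha \cdot f)(\mathbf{v}) + (\beta \cdot f)(\mathbf{v}) = (\alpha \cdot f + \beta \cdot f)(\mathbf{v}).
\end{aligned}
$$

For the axiom (7) these calculations look like

$$(\alpha \cdot (\beta \cdot f))(\mathbf{v}) = \alpha \cdot (\beta \cdot f)(\mathbf{v}) = \alpha \cdot (\beta \cdot f(\mathbf{v})) = (\alpha\beta) \cdot f(\mathbf{v}) = ((\alpha\beta) \cdot f)(\mathbf{v}).$$

In the case of the axiom (8) the calculations are even simpler:

$$(1 \cdot f)(\mathbf{v}) = 1 \cdot f(\mathbf{v}) = f(\mathbf{v}).$$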

Now let's consider the remaining axioms (3) and (4). The zero mapping is the
natural candidate for the role of the zero element in the space Map(V, W): it maps
each vector $\mathbf{v} \in V$ to the zero vector of the space $W$. For this mapping
we have

$$(f + 0)(\mathbf{v}) = f(\mathbf{v}) + 0(\mathbf{v}) = f(\mathbf{v}) + \mathbf{0} = f(\mathbf{v}).$$

As we see, the axiom (3) in Map(V, W) is fulfilled.

Suppose that $f \in$ Map(V, W). We define the opposite mapping $f'$ for $f$ as
follows: $f' = (-1) \cdot f$. Then we have

$$
\begin{aligned}
(f + f')(\mathbf{v}) &= (f + (-1) \cdot f)(\mathbf{v}) = f(\mathbf{v}) + ((-1) \cdot f)(\mathbf{v}) = \\
&= f(\mathbf{v}) + (-1) \cdot f(\mathbf{v}) = \mathbf{0} = 0(\mathbf{v}).
\end{aligned}
$$

The axiom (4) in Map(V, W) is also fulfilled. This completes the proof of
theorem 10.1. □

In a typical situation the space Map(V, W) is very large. Even for
finite-dimensional spaces $V$ and $W$ it is usually an infinite-dimensional space. In
linear algebra a much smaller subset of Map(V, W) is studied: the set of all
linear mappings from $V$ to $W$. It is denoted Hom(V, W) and is called the set
of homomorphisms. The following two theorems show that Hom(V, W) is closed
with respect to the algebraic operations in Map(V, W). Therefore, we can say that
Hom(V, W) is the space of homomorphisms.

Theorem 10.2. The pointwise sum of two linear mappings $f\colon V \to W$ and
$g\colon V \to W$ is a linear mapping from the space $V$ to the space $W$.

Theorem 10.3. The pointwise product of a linear mapping $f\colon V \to W$ by a
number $\alpha \in \mathbb{K}$ is a linear mapping from the space $V$ to the space $W$.

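PROOF. Let $h = f + g$ be the sum of two linear mappings $f$ and $g$. The
following calculations prove the linearity of the mapping $h$:

$$
\begin{aligned}
h(\mathbf{v}_1 + \mathbf{v}_2) &= f(\mathbf{v}_1 + \mathbf{v}_2) + g(\mathbf{v}_1 + \mathbf{v}_2) = (f(\mathbf{v}_1) + f(\mathbf{v}_2)) + (g(\mathbf{v}_1) + g(\mathbf{v}_2)) = \\
&= (f(\mathbf{v}_1) + g(\mathbf{v}_1)) + (f(\mathbf{v}_2) + g(\mathbf{v}_2)) = h(\mathbf{v}_1) + h(\mathbf{v}_2), \\
h(\beta \cdot \mathbf{v}) &= f(\beta \cdot \mathbf{v}) + g(\beta \cdot \mathbf{v}) = \beta \cdot f(\mathbf{v}) + \beta \cdot g(\mathbf{v}) = \\
&= \beta \cdot (f(\mathbf{v}) + g(\mathbf{v})) = \beta \cdot h(\mathbf{v}).
\end{aligned}
$$

Now let's consider the product of the mapping $f$ and the number $\alpha$. Let's
denote it by $h$, i. e. let's denote $h = \alpha \cdot f$. Then the following calculations

$$
\begin{aligned}
h(\mathbf{v}_1 + \mathbf{v}_2) &= \alpha \cdot f(\mathbf{v}_1 + \mathbf{v}_2) = \alpha \cdot (f(\mathbf{v}_1) + f(\mathbf{v}_2)) = \\
&= \alpha \cdot f(\mathbf{v}_1) + \alpha \cdot f(\mathbf{v}_2) = h(\mathbf{v}_1) + h(\mathbf{v}_2), \\
h(\beta \cdot \mathbf{v}) &= \alpha \cdot f(\beta \cdot \mathbf{v}) = \alpha \cdot (\beta \cdot f(\mathbf{v})) = \\
&= (\alpha\beta) \cdot f(\mathbf{v}) = (\beta\alpha) \cdot f(\mathbf{v}) = \beta \cdot (\alpha \cdot f(\mathbf{v})) = \beta \cdot h(\mathbf{v})
\end{aligned}
$$

prove the linearity of the mapping $h$ and thus complete the proofs of both
theorems 10.2 and 10.3. □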

The space of homomorphisms Hom(V, W) is a subspace in the space of all
mappings Map(V, W). It is much smaller and it consists of objects which are in
the scope of linear algebra. For finite-dimensional spaces $V$ and $W$ the space of
homomorphisms Hom(V, W) is also finite-dimensional. This is the result of the
following theorem.

Theorem 10.4. For finite-dimensional spaces $V$ and $W$ the space of
homomorphisms Hom(V, W) is also finite-dimensional. Its dimension is given by the
formula

$$\dim(\operatorname{Hom}(V, W)) = \dim(V) \cdot \dim(W). \tag{10.1}$$

PROOF. Let $\dim V = n$ and $\dim W = m$. We choose a basis
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ in the space $V$ and another basis
$\mathbf{h}_1, \ldots, \mathbf{h}_m$ in the space $W$. Let $1 \le i \le n$ and
$1 \le j \le m$. For each fixed pair of indices $i$, $j$ within the above ranges we
consider the following set of $n$ vectors in the space $W$:

$$\mathbf{w}_1 = \mathbf{0}, \ \ldots, \ \mathbf{w}_{i-1} = \mathbf{0}, \ \mathbf{w}_i = \mathbf{h}_j, \ \mathbf{w}_{i+1} = \mathbf{0}, \ \ldots, \ \mathbf{w}_n = \mathbf{0}.$$

All vectors in this set are equal to zero, except for the $i$-th vector
$\mathbf{w}_i$, which is equal to the $j$-th basis vector $\mathbf{h}_j$. Now we apply
theorem 9.2 to the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and to the set of
vectors $\mathbf{w}_1, \ldots, \mathbf{w}_n$. This defines the linear mapping
$E^i_j\colon V \to W$ such that $E^i_j(\mathbf{e}_s) = \mathbf{w}_s$ for all
$s = 1, \ldots, n$. We write this fact as

$$E^i_j(\mathbf{e}_s) = \delta^i_s \cdot \mathbf{h}_j, \tag{10.2}$$

where $\delta^i_s$ is the Kronecker symbol. As a result we have constructed
$n \cdot m$ mappings $E^i_j$ satisfying the relationships (10.2):

$$E^i_j\colon V \to W, \quad \text{where } 1 \le i \le n, \ 1 \le j \le m. \tag{10.3}$$

Now we show that the mappings (10.3) span the space of homomorphisms
Hom(V, W). For this purpose we take a linear mapping $f \in$ Hom(V, W). Suppose
that $F$ is its matrix in the pair of bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and
$\mathbf{h}_1, \ldots, \mathbf{h}_m$. Denote by $F^j_i$ the elements of this matrix.
Then the result of applying $f$ to an arbitrary vector $\mathbf{v} \in V$ is
determined by the coordinates $v^1, \ldots, v^n$ of this vector according to the formula

$$f(\mathbf{v}) = \sum_{i=1}^{n} v^i \cdot f(\mathbf{e}_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} (F^j_i\, v^i) \cdot \mathbf{h}_j. \tag{10.4}$$

Applying $E^i_j$ to the same vector $\mathbf{v}$ and taking into account (10.2), we derive

$$E^i_j(\mathbf{v}) = \sum_{s=1}^{n} v^s \cdot E^i_j(\mathbf{e}_s) = \sum_{s=1}^{n} (v^s\, \delta^i_s) \cdot \mathbf{h}_j = v^i \cdot \mathbf{h}_j. \tag{10.5}$$

Now, comparing the relationships (10.4) and (10.5), we find

$$f(\mathbf{v}) = \sum_{i=1}^{n} \sum_{j=1}^{m} F^j_i \cdot E^i_j(\mathbf{v}).$$

Since $\mathbf{v}$ is an arbitrary vector of the space $V$, this formula means that
$f$ is a linear combination of the mappings (10.3):

$$f = \sum_{i=1}^{n} \sum_{j=1}^{m} F^j_i \cdot E^i_j.$$

Hence, the mappings (10.3) span the space of homomorphisms Hom(V, W). This
proves the finite-dimensionality of the space Hom(V, W).

In order to calculate the dimension of Hom(V, W) we shall prove that the
mappings (10.3) are linearly independent. Let's consider a linear combination of
these mappings which is equal to zero:

$$\sum_{i=1}^{n} \sum_{j=1}^{m} \gamma^j_i \cdot E^i_j = 0. \tag{10.6}$$

Both the left and the right hand sides of the equality (10.6) represent the zero
mapping $0\colon V \to W$. Let's apply this mapping to the basis vector
$\mathbf{e}_s$. Then

$$\sum_{i=1}^{n} \sum_{j=1}^{m} \gamma^j_i \cdot E^i_j(\mathbf{e}_s) = \sum_{i=1}^{n} \sum_{j=1}^{m} (\gamma^j_i\, \delta^i_s) \cdot \mathbf{h}_j = \mathbf{0}.$$

The sum over the index $i$ can be calculated explicitly. As a result we get linear
combinations of the basis vectors in $W$ which are equal to zero:

$$\sum_{j=1}^{m} \gamma^j_s \cdot \mathbf{h}_j = \mathbf{0}.$$

Due to the linear independence of the vectors $\mathbf{h}_1, \ldots, \mathbf{h}_m$ we
derive $\gamma^j_s = 0$. This means that the linear combination (10.6) is necessarily
trivial. Hence, the mappings (10.3) are linearly independent. They form a basis in
Hom(V, W). Now, by counting these mappings we find that the required formula (10.1)
is valid. □

The meaning of the above theorem becomes transparent in terms of the matrices
of linear mappings. Indeed, upon choosing the bases in $V$ and $W$ the linear
mappings from Hom(V, W) are represented by rectangular $m \times n$ matrices. The
sum of mappings corresponds to the sum of matrices, and the product of a
mapping by a number corresponds to the product of the matrix by that number.
Note that rectangular $m \times n$ matrices form a linear vector space isomorphic to
the arithmetic linear vector space $\mathbb{K}^{mn}$. This space is denoted as
$\mathbb{K}^{m \times n}$. So, the choice of bases in $V$ and $W$ defines an
isomorphism of Hom(V, W) and $\mathbb{K}^{m \times n}$.
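In matrix terms the basis mappings $E^i_j$ from the proof of theorem 10.4 are represented by matrices with a single unit entry. The sketch below illustrates this under stated assumptions: indices are 0-based, the sizes and the matrix `F` are hypothetical, and the helper names `unit_mapping` and `combine` are chosen for this example only.

```python
def unit_mapping(i, j, n, m):
    """Matrix of E^i_j: it sends e_i to h_j and the other basis vectors to zero.
    The matrix has m rows and n columns; indices are 0-based here."""
    return [[1 if (row == j and col == i) else 0 for col in range(n)]
            for row in range(m)]

def combine(coeffs, n, m):
    """Linear combination sum over i, j of coeffs[j][i] * E^i_j, entry by entry."""
    total = [[0] * n for _ in range(m)]
    for i in range(n):
        for j in range(m):
            E = unit_mapping(i, j, n, m)
            for row in range(m):
                for col in range(n):
                    total[row][col] += coeffs[j][i] * E[row][col]
    return total

n, m = 3, 2                      # dim V = 3, dim W = 2 (hypothetical sizes)
F = [[5, -1, 0],
     [2,  7, 4]]                 # an arbitrary m x n matrix; F[j][i] plays the role of F^j_i

basis = [unit_mapping(i, j, n, m) for i in range(n) for j in range(m)]
print(len(basis))                # the number of mappings E^i_j, i.e. n * m
print(combine(F, n, m) == F)     # F is recovered as the sum of F^j_i * E^i_j
```

The count of the matrices agrees with formula (10.1), and the second check mirrors the expansion of $f$ in the mappings (10.3).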
CHAPTER II
LINEAR OPERATORS. 



§ 1. Linear operators. The algebra of endomorphisms 
End(V) and the group of automorphisms Aut(V). 

A linear mapping $f\colon V \to V$ acting from a linear vector space $V$ to the same
vector space $V$ is called a linear operator¹. Linear operators are special forms of
linear mappings. Therefore, we can apply to them all results of the previous chapter.
However, the less general an object is, the more specific features it has. Therefore,
the theory of linear operators appears to be richer and more complicated than the
theory of linear mappings. It contains not only the strengthening of previous theorems
for this particular case, but also a class of problems that cannot be formulated for
the case of general linear mappings.

Let's consider the space of homomorphisms Hom(V, W). If $W = V$, this space
is called the space of endomorphisms End(V) = Hom(V, V). It consists of linear
operators $f\colon V \to V$, which are also called endomorphisms of the space $V$.
Unlike the space of homomorphisms Hom(V, W), the space of endomorphisms End(V) is
equipped with an additional binary algebraic operation. Indeed, if we have two
linear operators $f, g \in$ End(V), we can not only add them and multiply them
by numbers, but we can also construct two compositions $f \circ g \in$ End(V) and
$g \circ f \in$ End(V).

Theorem 1.1. Let End(V) be the space of endomorphisms of a linear vector
space $V$. Here, apart from the axioms (1)-(8) of a linear vector space, the following
relationships are fulfilled:

(9) $(f + g) \circ h = f \circ h + g \circ h$;
(10) $(\alpha \cdot f) \circ h = \alpha \cdot (f \circ h)$;
(11) $f \circ (g + h) = f \circ g + f \circ h$;
(12) $f \circ (\alpha \cdot g) = \alpha \cdot (f \circ g)$.

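PROOF. Each of the equalities (9)-(12) is an operator equality. As we know,
the equality of two operators means that these operators yield the same result
when applied to an arbitrary vector $\mathbf{v} \in V$:

$$
\begin{aligned}
((f + g) \circ h)(\mathbf{v}) &= (f + g)(h(\mathbf{v})) = f(h(\mathbf{v})) + g(h(\mathbf{v})) = \\
&= (f \circ h)(\mathbf{v}) + (g \circ h)(\mathbf{v}) = (f \circ h + g \circ h)(\mathbf{v}),
\end{aligned}
$$

$$
\begin{aligned}
((\alpha \cdot f) \circ h)(\mathbf{v}) &= (\alpha \cdot f)(h(\mathbf{v})) = \alpha \cdot f(h(\mathbf{v})) = \\
&= \alpha \cdot (f \circ h)(\mathbf{v}) = (\alpha \cdot (f \circ h))(\mathbf{v}),
\end{aligned}
$$

$$
\begin{aligned}
(f \circ (g + h))(\mathbf{v}) &= f((g + h)(\mathbf{v})) = f(g(\mathbf{v}) + h(\mathbf{v})) = \\
&= f(g(\mathbf{v})) + f(h(\mathbf{v})) = (f \circ g)(\mathbf{v}) + (f \circ h)(\mathbf{v}) = (f \circ g + f \circ h)(\mathbf{v}),
\end{aligned}
$$

$$
\begin{aligned}
(f \circ (\alpha \cdot g))(\mathbf{v}) &= f((\alpha \cdot g)(\mathbf{v})) = f(\alpha \cdot g(\mathbf{v})) = \\
&= \alpha \cdot f(g(\mathbf{v})) = \alpha \cdot (f \circ g)(\mathbf{v}) = (\alpha \cdot (f \circ g))(\mathbf{v}).
\end{aligned}
$$

The above calculations prove the properties (9)-(12) of the composition of linear
operators. □

¹ This terminology is not common; however, in this book we follow it strictly.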

Let's fix the operator $h \in$ End(V) and consider the composition $f \circ h$ as a
rule that maps each operator $f$ to the other operator $g = f \circ h$. Then we get a
mapping:

$$R_h\colon \operatorname{End}(V) \to \operatorname{End}(V).$$

The first two properties (9) and (10) from theorem 1.1 mean that $R_h$ is a
linear mapping. This mapping is called the right shift by $h$ since it acts as a
composition where $h$ is placed on the right side. In a similar way we can define
another mapping, which is called the left shift by $h$:

$$L_h\colon \operatorname{End}(V) \to \operatorname{End}(V).$$

It acts according to the rule $L_h(f) = h \circ f$. This mapping is linear due to the
properties (11) and (12) from theorem 1.1.

The operation of composition is an additional binary operation in the space
of endomorphisms End(V). The linearity of the mapping $R_h$ is interpreted as
the linearity of this binary operation in its first argument, while the linearity of
$L_h$ is said to be the linearity of composition in its second argument. A binary
algebraic operation linear in both arguments is called a bilinear operation. A
situation where a linear vector space is equipped with an additional bilinear
algebraic operation is rather typical.

Definition 1.1. A linear vector space $A$ over a numeric field $\mathbb{K}$ equipped
with a bilinear binary operation of vector multiplication is called an algebra over
the field $\mathbb{K}$ or simply a $\mathbb{K}$-algebra.

The operation of multiplication in algebras is usually denoted by some sign
like a dot «$\cdot$» or a circle «$\circ$», but very often this sign is omitted
altogether. The algebra $A$ is called a commutative algebra if the multiplication in
it is commutative: $a\,b = b\,a$. Similarly, the algebra $A$ is called an associative
algebra if the operation of multiplication is associative: $(a\,b)\,c = a\,(b\,c)$.

From the definition 1.1 and from theorem 1.1 we conclude that the linear
space End(V) with the operation of composition taken for multiplication is an
algebra over the same numeric field $\mathbb{K}$ as the initial vector space $V$. This
algebra is called the algebra of endomorphisms of a linear vector space $V$. It is
associative due to theorem 1.6 from Chapter I. However, this algebra is not
commutative in the general case.

The operation of composition is treated as a multiplication in the algebra
of endomorphisms End(V). Therefore, its sign is usually omitted when written in this
context. The multiplication of operators has higher priority than addition. The
relative priority of operator multiplication and multiplication by numbers makes no
difference at all. This follows from the axiom (7) for the space End(V) and from the
properties (10) and (12) of the multiplication in End(V). Now we can consider
positive integer powers of linear operators:

$$f^2 = f\,f, \qquad f^3 = f^2 f, \qquad f^{n+1} = f^n f.$$

If an operator $f$ is bijective, then we have the inverse operator $f^{-1}$ and we
can consider negative integer powers of $f$ as well:

$$f^{-2} = f^{-1} f^{-1}, \qquad f^n f^m = f^{n+m} = f^m f^n.$$

The latter equality is valid for both positive and negative values of the integer
constants $n$ and $m$.

Definition 1.2. An algebra $A$ over the field $\mathbb{K}$ is called an algebra with
unit element or an algebra with unity if there is an element $1 \in A$ such that
$1 \cdot a = a$ and $a \cdot 1 = a$ for all $a \in A$.

The algebra of endomorphisms End(V) is an algebra with unity. The identical
operator plays the role of the unit element in this algebra: $1 = \operatorname{id}_V$.
Therefore, this operator is also called the unit operator or the operator unity.

Definition 1.3. A linear operator $f\colon V \to V$ is called a scalar operator if it
is obtained by multiplying the unit operator $1$ by a number $\lambda \in \mathbb{K}$,
i. e. if $f = \lambda \cdot 1$.

The basic purpose of operators from the space End(V) is to act upon vectors of
the space $V$. Suppose that $a, b \in$ End(V) and let $\mathbf{x}, \mathbf{y} \in V$. Then

(1) $(a + b)(\mathbf{x}) = a(\mathbf{x}) + b(\mathbf{x})$;

(2) $a(\mathbf{x} + \mathbf{y}) = a(\mathbf{x}) + a(\mathbf{y})$.

These two relationships are well known: the first one follows from the definition of
the sum of two operators, the second relationship follows from the linearity of the
operator $a$. The question is why the vectors $\mathbf{x}$ and $\mathbf{y}$ in the
above formulas are surrounded by brackets. This is the consequence of the «functional»
form of writing the action of an operator upon a vector: the operator sign is put on
the left and the vector sign is put on the right and is enclosed in brackets like an
argument of a function: $\mathbf{w} = f(\mathbf{v})$. Algebraists use a more «liberal»
form of writing: $\mathbf{w} = f\,\mathbf{v}$. The operator sign is on the left and the
vector sign is on the right, but no brackets are used. If we know that $f \in$ End(V)
and $\mathbf{v} \in V$, then such a notation causes no confusion. In a more
complicated case, even if we know that $\alpha \in \mathbb{K}$, $f, g \in$ End(V), and
$\mathbf{v} \in V$, the notation $\mathbf{w} = \alpha\, f\, g\, \mathbf{v}$ admits
several interpretations:

$$
\begin{aligned}
\mathbf{w} &= \alpha \cdot f(g(\mathbf{v})), & \mathbf{w} &= (\alpha \cdot f)(g(\mathbf{v})), \\
\mathbf{w} &= (\alpha \cdot (f \circ g))(\mathbf{v}), & \mathbf{w} &= ((\alpha \cdot f) \circ g)(\mathbf{v}).
\end{aligned}
$$

However, for any one of these interpretations we get the same vector $\mathbf{w}$.
Therefore, in what follows we shall use the algebraic form of writing the action of an
operator upon a vector, especially in long calculations.

Let $f\colon V \to V$ be a linear operator in a finite-dimensional vector space $V$.
According to the general scheme of constructing the matrix of a linear mapping
we should choose two bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and
$\mathbf{h}_1, \ldots, \mathbf{h}_n$ in $V$ and consider the expansions similar to
(9.1) in Chapter I. No doubt this approach is valid, and it could be very fruitful
in some cases. However, having two bases in one space is certainly excessive.
Therefore, when constructing the matrix of a linear operator the second basis
$\mathbf{h}_1, \ldots, \mathbf{h}_n$ is chosen to coincide with the first one. The
matrix $F$ of an operator $f$ is determined from the expansions

$$
\begin{aligned}
f(\mathbf{e}_1) &= F^1_1 \cdot \mathbf{e}_1 + \ldots + F^n_1 \cdot \mathbf{e}_n, \\
&\ \ \vdots \\
f(\mathbf{e}_n) &= F^1_n \cdot \mathbf{e}_1 + \ldots + F^n_n \cdot \mathbf{e}_n,
\end{aligned}
\tag{1.1}
$$

which can be expressed in brief form by the formula

$$f(\mathbf{e}_j) = \sum_{i=1}^{n} F^i_j \cdot \mathbf{e}_i. \tag{1.2}$$

The matrix $F$ determined by the expansions (1.1) or by the expansions (1.2) is
called the matrix of a linear operator $f$ in the basis
$\mathbf{e}_1, \ldots, \mathbf{e}_n$. This is a square $n \times n$ matrix, where
$n = \dim V$.

Theorem 1.2. Matrices related to operators $f \in$ End(V) in some fixed basis
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ possess the following properties:

(1) the sum of two operators is represented by the sum of their matrices;

(2) the product of an operator by a number is represented by the product of its
matrix by that number;

(3) the composition of two operators is represented by the product of their matrices.

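PROOF. Consider the operators $f$, $g$, and $h$ from End(V). Let $F$, $G$, and $H$
be their matrices in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Proving the first
proposition in theorem 1.2, let's denote $h = f + g$. Then

$$
\begin{aligned}
\sum_{i=1}^{n} H^i_j \cdot \mathbf{e}_i &= h(\mathbf{e}_j) = (f + g)\,\mathbf{e}_j = f(\mathbf{e}_j) + g(\mathbf{e}_j) = \\
&= \sum_{i=1}^{n} F^i_j \cdot \mathbf{e}_i + \sum_{i=1}^{n} G^i_j \cdot \mathbf{e}_i = \sum_{i=1}^{n} (F^i_j + G^i_j) \cdot \mathbf{e}_i.
\end{aligned}
$$

Due to the uniqueness of the expansion of a vector in a basis we have
$H^i_j = F^i_j + G^i_j$ and $H = F + G$. The first proposition of the theorem is proved.

The proof of the second proposition is similar. Let's denote $h = \alpha \cdot f$. Then

$$
\begin{aligned}
\sum_{i=1}^{n} H^i_j \cdot \mathbf{e}_i &= h(\mathbf{e}_j) = (\alpha \cdot f)\,\mathbf{e}_j = \alpha \cdot f(\mathbf{e}_j) = \\
&= \alpha \cdot \left( \sum_{i=1}^{n} F^i_j \cdot \mathbf{e}_i \right) = \sum_{i=1}^{n} (\alpha\, F^i_j) \cdot \mathbf{e}_i.
\end{aligned}
$$

Therefore, $H^i_j = \alpha\, F^i_j$ and $H = \alpha \cdot F$. The proof of the third
proposition requires a little more effort. Denote $h = f \circ g$. Then

$$
\begin{aligned}
h(\mathbf{e}_j) &= f(g(\mathbf{e}_j)) = f\!\left( \sum_{i=1}^{n} G^i_j \cdot \mathbf{e}_i \right) = \sum_{i=1}^{n} G^i_j \cdot f(\mathbf{e}_i) = \\
&= \sum_{i=1}^{n} G^i_j \cdot \left( \sum_{s=1}^{n} F^s_i \cdot \mathbf{e}_s \right) = \sum_{s=1}^{n} \left( \sum_{i=1}^{n} F^s_i\, G^i_j \right) \cdot \mathbf{e}_s = \sum_{s=1}^{n} H^s_j \cdot \mathbf{e}_s.
\end{aligned}
$$

Due to the uniqueness of the expansion of a vector in a basis we derive

$$H^s_j = \sum_{i=1}^{n} F^s_i\, G^i_j.$$

The right side of this equality is easily interpreted as the product of two matrices
written in terms of the components of these matrices. Therefore, $H = F\,G$. The
theorem is proved. □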

From the theorem proved just above we conclude that, by associating each
operator $f \in$ End(V) with its matrix, we establish an isomorphism of the algebra
End(V) and the matrix algebra $\mathbb{K}^{n \times n}$ with standard matrix
multiplication.
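The third proposition of theorem 1.2 can be checked on a small example. The sketch below is an illustration under stated assumptions: the operators are given directly by hypothetical $2 \times 2$ matrices `F` and `G` in one fixed basis, and the helper names `apply` and `matmul` are chosen for this example.

```python
def apply(M, x):
    """Coordinates of f(x) for the operator with matrix M: y^i = sum_j M^i_j x^j."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(A, B):
    """Matrix product: (A B)^s_j = sum_i A^s_i B^i_j."""
    n = len(B)
    return [[sum(A[s][i] * B[i][j] for i in range(n)) for j in range(len(B[0]))]
            for s in range(len(A))]

F = [[1, 2], [0, 1]]   # hypothetical matrix of f
G = [[3, 0], [1, 1]]   # hypothetical matrix of g
x = [4, -5]            # coordinates of a vector

lhs = apply(F, apply(G, x))    # (f o g)(x) computed as f(g(x))
rhs = apply(matmul(F, G), x)   # the same vector computed from the matrix F G
print(lhs == rhs)
print(matmul(F, G) == matmul(G, F))   # these two matrices do not commute
```

The second comparison shows, for this particular pair, that $F\,G \neq G\,F$, in agreement with the remark that the algebra End(V) is not commutative in the general case.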

Now let's study how the matrix of a linear operator $f\colon V \to V$ changes under
the change of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ for some other basis
$\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. Let $S$ be the direct transition
matrix and let $T$ be the inverse one. Note that we need not derive the transformation
formulas again. We can adapt the formulas (9.10) from Chapter I for our present
purpose. Since the basis $\mathbf{h}_1, \ldots, \mathbf{h}_n$ coincides with
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ and the basis
$\tilde{\mathbf{h}}_1, \ldots, \tilde{\mathbf{h}}_n$ coincides with
$\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$, we have $P = S$. Then the
transformation formulas are written as

$$\tilde F = S^{-1}\,F\,S, \qquad F = S\,\tilde F\,S^{-1}. \tag{1.3}$$

These are the required formulas for transforming the matrix of a linear operator
under a change of basis. Taking into account that $T = S^{-1}$, we can write (1.3) as

$$\tilde F^p_q = \sum_{i=1}^{n} \sum_{j=1}^{n} T^p_i\, F^i_j\, S^j_q, \qquad F^i_j = \sum_{q=1}^{n} \sum_{p=1}^{n} S^i_p\, \tilde F^p_q\, T^q_j. \tag{1.4}$$

The relationships (1.3) yield a very important formula relating the determinants
of the matrices $F$ and $\tilde F$. Indeed, we have

$$\det \tilde F = \det(S^{-1})\, \det F\, \det S = (\det S)^{-1}\, \det F\, \det S = \det F.$$

The coincidence of the determinants of the matrices of a linear operator $f$ in two
arbitrary bases means that they represent a number which does not depend on a
basis at all.

Definition 1.4. The determinant $\det f$ of a linear operator $f$ is the number
equal to the determinant of the matrix $F$ of this linear operator in some basis.
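The independence of the determinant from the choice of a basis can be illustrated numerically. The following sketch works under stated assumptions: the $2 \times 2$ matrices `F` and `S` are hypothetical, and the helpers `matmul`, `det2` and `inv2` are written for the $2 \times 2$ case only.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    """Inverse of a 2 x 2 matrix with nonzero determinant, in exact arithmetic."""
    d = Fraction(det2(M))
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]

F = [[2, 1], [0, 3]]          # hypothetical matrix of f in the old basis
S = [[1, 2], [1, 3]]          # hypothetical transition matrix with nonzero determinant

F_new = matmul(inv2(S), matmul(F, S))   # first formula in (1.3)
print(F_new != F)                       # the components of the matrix change
print(det2(F_new) == det2(F))           # the determinant does not
```

The components of the matrix depend on the basis, while the determinant computed from them is the same in both bases.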

A numeric invariant of a geometric object in a linear vector space $V$ is a
number determined by this geometric object such that it does not depend on
anything other than the geometric object itself. The determinant $\det f$ of a
linear operator is an example of such a numeric invariant. Coordinates of a vector or
components of the matrix of a linear operator are not numeric invariants. Another
example of a numeric invariant of a linear operator is its rank:

$$\operatorname{rank} f = \dim(\operatorname{Im} f).$$

Soon we shall define a lot of other numeric invariants of a linear operator.

From the third proposition of theorem 1.2 we derive the following formula
for the determinant of a linear operator:

$$\det(f \circ g) = \det(f) \cdot \det(g). \tag{1.5}$$

Theorem 1.3. A linear operator $f\colon V \to V$ in a finite-dimensional linear
vector space $V$ is injective if and only if it is surjective.

PROOF. In order to prove this theorem we apply theorem 1.2 and the two
theorems 8.3 and 9.4 from Chapter I. The injectivity of the linear operator $f$
is equivalent to the condition $\operatorname{Ker} f = \{\mathbf{0}\}$, the
surjectivity of the operator $f$ is equivalent to $\operatorname{Im} f = V$, while
theorem 9.4 from Chapter I relates the dimensions of these two subspaces
$\operatorname{Ker} f$ and $\operatorname{Im} f$:

$$\dim(\operatorname{Ker} f) + \dim(\operatorname{Im} f) = \dim(V).$$

If the operator $f$ is injective, then $\operatorname{Ker} f = \{\mathbf{0}\}$ and
$\dim(\operatorname{Ker} f) = 0$. Then $\dim(\operatorname{Im} f) = \dim(V)$. Applying
the third proposition of theorem 4.5 from Chapter I, we get
$\operatorname{Im} f = V$, which proves the surjectivity of the operator $f$.

Conversely, if the operator $f$ is surjective, then $\operatorname{Im} f = V$ and
$\dim(\operatorname{Im} f) = \dim(V)$. Hence, $\dim(\operatorname{Ker} f) = 0$ and
$\operatorname{Ker} f = \{\mathbf{0}\}$. This proves the injectivity of the
operator $f$. □

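Theorem 1.4. A linear operator $f\colon V \to V$ in a finite-dimensional linear
vector space $V$ is bijective if and only if $\det f \neq 0$.

PROOF. Let $\mathbf{x}$ be a vector of $V$ and let $\mathbf{y} = f(\mathbf{x})$.
Expanding $\mathbf{x}$ and $\mathbf{y}$ in some basis
$\mathbf{e}_1, \ldots, \mathbf{e}_n$, we get the following formula relating their
coordinates:

$$
\begin{pmatrix} y^1 \\ \vdots \\ y^n \end{pmatrix}
=
\begin{pmatrix}
F^1_1 & \cdots & F^1_n \\
\vdots & \ddots & \vdots \\
F^n_1 & \cdots & F^n_n
\end{pmatrix}
\begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}.
\tag{1.6}
$$

The formula (1.6) can be derived independently or one can derive it from the
formula (9.5) of Chapter I. From this formula we derive that $\mathbf{x}$ belongs to
the kernel of the operator $f$ if and only if its coordinates $x^1, \ldots, x^n$
satisfy the homogeneous system of linear equations

$$
\begin{pmatrix}
F^1_1 & \cdots & F^1_n \\
\vdots & \ddots & \vdots \\
F^n_1 & \cdots & F^n_n
\end{pmatrix}
\begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}
=
\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}.
\tag{1.7}
$$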

The matrix of this system of equations coincides with the matrix of the operator
$f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Therefore, the kernel of the
operator $f$ is nonzero if and only if the system of equations (1.7) has a nonzero
solution. Here we use the well-known result from the theory of determinants: a
homogeneous system of linear algebraic equations with a square matrix $F$ has a
nonzero solution if and only if $\det F = 0$. The proof of this fact can be found in
[5]. From this result we immediately get that the condition $\det f \neq 0$ is
equivalent to $\operatorname{Ker} f = \{\mathbf{0}\}$. Due to the previous theorem
and due to theorem 1.1 from Chapter I the latter equality
$\operatorname{Ker} f = \{\mathbf{0}\}$ is equivalent to the bijectivity of $f$. The
theorem is proved. □

An operator $f$ with zero determinant $\det f = 0$ is called a degenerate operator.
Using this terminology we can formulate the following corollary of theorem 1.4.

Corollary. A linear operator $f\colon V \to V$ in a finite-dimensional space $V$ has
a nontrivial kernel $\operatorname{Ker} f \neq \{\mathbf{0}\}$ if and only if it is
degenerate. Otherwise this linear operator is bijective.

Remember that a bijective linear mapping $f$ from $V$ to $W$ is called an
isomorphism. If $W = V$, such a mapping establishes an isomorphism of the space $V$
with itself. Therefore, it is called an automorphism of the space $V$. The set of all
automorphisms of the space $V$ is denoted by Aut(V). It is obvious that Aut(V)
possesses the following properties:

(1) if $f, g \in$ Aut(V), then $f \circ g \in$ Aut(V);

(2) if $f \in$ Aut(V), then $f^{-1} \in$ Aut(V);

(3) $1 \in$ Aut(V), where $1$ is the identical operator.

It is easy to see that due to the above three properties the set of automorphisms
Aut(V) is equipped with the structure of a group. The group of automorphisms
Aut(V) is a subset in the algebra of endomorphisms End(V); however, it does not
inherit the structure of an algebra, nor even the structure of a linear vector space.
This is clear because, for instance, the zero operator does not belong to Aut(V). In
the case of a finite-dimensional space $V$ the group of automorphisms consists of all
non-degenerate operators.

§ 2. Projection operators.

Let $V$ be a linear vector space expanded into a direct sum of two subspaces:

$$V = U_1 \oplus U_2. \tag{2.1}$$

Due to the expansion (2.1) each vector $\mathbf{v} \in V$ is expanded into a sum

$$\mathbf{v} = \mathbf{u}_1 + \mathbf{u}_2, \quad \text{where } \mathbf{u}_1 \in U_1 \text{ and } \mathbf{u}_2 \in U_2, \tag{2.2}$$

the components $\mathbf{u}_1$ and $\mathbf{u}_2$ in (2.2) being uniquely determined by
the vector $\mathbf{v}$.

Definition 2.1. The operator $P\colon V \to V$ mapping each vector
$\mathbf{v} \in V$ to its first component $\mathbf{u}_1$ in the expansion (2.2) is
called the operator of projection onto the subspace $U_1$ parallel to the subspace $U_2$.

Theorem 2.1. For any expansion of the form (2.1) the operator of projection
onto the subspace $U_1$ parallel to the subspace $U_2$ is a linear operator.

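PROOF. Let's consider a pair of vectors $\mathbf{v}_1, \mathbf{v}_2$ from the space
$V$, and for each of them consider an expansion like (2.2):

$$
\begin{aligned}
\mathbf{v}_1 &= \mathbf{u}_1 + \mathbf{u}_2, \\
\mathbf{v}_2 &= \tilde{\mathbf{u}}_1 + \tilde{\mathbf{u}}_2.
\end{aligned}
$$

Then $P(\mathbf{v}_1) = \mathbf{u}_1$ and $P(\mathbf{v}_2) = \tilde{\mathbf{u}}_1$.
Let's add the above two expansions and write

$$\mathbf{v}_1 + \mathbf{v}_2 = (\mathbf{u}_1 + \tilde{\mathbf{u}}_1) + (\mathbf{u}_2 + \tilde{\mathbf{u}}_2). \tag{2.3}$$

From $\mathbf{u}_1, \tilde{\mathbf{u}}_1 \in U_1$ and from
$\mathbf{u}_2, \tilde{\mathbf{u}}_2 \in U_2$ we derive
$\mathbf{u}_1 + \tilde{\mathbf{u}}_1 \in U_1$ and
$\mathbf{u}_2 + \tilde{\mathbf{u}}_2 \in U_2$. Therefore, (2.3) is an expansion of the
form (2.2) for the vector $\mathbf{v}_1 + \mathbf{v}_2$. Then

$$P(\mathbf{v}_1 + \mathbf{v}_2) = \mathbf{u}_1 + \tilde{\mathbf{u}}_1 = P(\mathbf{v}_1) + P(\mathbf{v}_2). \tag{2.4}$$

Now let's consider the expansion (2.2) for an arbitrary vector $\mathbf{v} \in V$ and
multiply it by a number $\alpha \in \mathbb{K}$:

$$\alpha \cdot \mathbf{v} = (\alpha \cdot \mathbf{u}_1) + (\alpha \cdot \mathbf{u}_2).$$

Then $\alpha \cdot \mathbf{u}_1 \in U_1$ and $\alpha \cdot \mathbf{u}_2 \in U_2$,
therefore, due to the definition of $P$ we get

$$P(\alpha \cdot \mathbf{v}) = \alpha \cdot \mathbf{u}_1 = \alpha \cdot P(\mathbf{v}). \tag{2.5}$$

The relationships (2.4) and (2.5) are exactly the relationships that express the
linearity of the operator $P$. □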

Suppose that $\mathbf{v}$ in the expansion (2.2) is chosen to be a vector of the
subspace $U_1$. Then the expansion (2.2) for this vector is
$\mathbf{v} = \mathbf{v} + \mathbf{0}$, therefore, $P(\mathbf{v}) = \mathbf{v}$.
This means that all vectors of the subspace $U_1$ are projected by $P$ onto themselves.
This fact has an important consequence: $P^2 = P$. Indeed, for any $\mathbf{v} \in V$
we have $P(\mathbf{v}) \in U_1$, therefore, $P(P(\mathbf{v})) = P(\mathbf{v})$.

Besides $P$, by means of (2.2) we can define the other operator $Q$ such that
$Q(\mathbf{v}) = \mathbf{u}_2$. It is also a projection operator: it projects onto
$U_2$ parallel to $U_1$. Therefore, $Q^2 = Q$. For the sum of these two operators we
get $P + Q = 1$. Indeed, for any vector $\mathbf{v} \in V$ we have

$$P(\mathbf{v}) + Q(\mathbf{v}) = \mathbf{u}_1 + \mathbf{u}_2 = \mathbf{v} = \operatorname{id}_V(\mathbf{v}) = 1(\mathbf{v}).$$

If $\mathbf{v} \in U_1$, then the expansion (2.2) for this vector is
$\mathbf{v} = \mathbf{v} + \mathbf{0}$, therefore, $Q(\mathbf{v}) = \mathbf{0}$.
Similarly, $P(\mathbf{v}) = \mathbf{0}$ for all $\mathbf{v} \in U_2$. Hence, we derive
$Q(P(\mathbf{v})) = \mathbf{0}$ and $P(Q(\mathbf{v})) = \mathbf{0}$ for any
$\mathbf{v} \in V$. Summarizing these results, we write

$$
\begin{aligned}
P^2 &= P, & P + Q &= 1, \\
Q^2 &= Q, & P\,Q &= Q\,P = 0.
\end{aligned}
\tag{2.6}
$$

A pair of projection operators satisfying the relationships (2.6) is called a
concordant pair of projectors.

In order to get a concordant pair of projectors it is sufficient to define only one
of them, for instance, the operator $P$, provided that $P^2 = P$. The second operator
$Q$ is then given by the formula $Q = 1 - P$. All of the relationships (2.6) are thereby
fulfilled automatically. Indeed, we have the relationships

$$P \circ Q = P \circ (1 - P) = P - P^2 = P - P = 0,$$
$$Q \circ P = (1 - P) \circ P = P - P^2 = P - P = 0.$$

The relationship $Q^2 = Q$ for $Q$ is derived in a similar way:

$$Q^2 = (1 - P) \circ (1 - P) = 1 - 2P + P^2 = 1 - 2P + P = 1 - P = Q.$$
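The relationships (2.6) are easy to check numerically. The following is a minimal sketch, not part of the original text: the concrete matrices and the helper names `matmul` and `matsub` are assumptions chosen for illustration. It takes the plane with $U_1$ spanned by $(1, 0)$ and $U_2$ spanned by $(1, 1)$, so that $P$ maps $(x, y)$ to $(x - y, 0)$.

```python
# Illustrative 2x2 example (the matrices are assumptions chosen for this sketch):
# P projects onto U1 = span{(1, 0)} parallel to U2 = span{(1, 1)}.

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matsub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

ONE = [[1, 0], [0, 1]]      # identity operator 1
ZERO = [[0, 0], [0, 0]]     # zero operator
P = [[1, -1], [0, 0]]       # (x, y) -> (x - y, 0)
Q = matsub(ONE, P)          # Q = 1 - P, i.e. (x, y) -> (y, y)

# The four relationships (2.6) besides P + Q = 1, which holds by construction.
relations_hold = (
    matmul(P, P) == P and matmul(Q, Q) == Q
    and matmul(P, Q) == ZERO and matmul(Q, P) == ZERO
)
print(relations_hold)
```

Only $P$ was specified by hand here; $Q$ was obtained as $1 - P$, in line with the remark above.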
Theorem 2.2. An operator $P: V \to V$ is a projector onto a subspace parallel
to another subspace if and only if $P^2 = P$.

Proof. We have already shown that any projector satisfies the equality $P^2 = P$.
Let's prove the converse proposition. Suppose that $P^2 = P$. Let's denote
$Q = 1 - P$. Then for the operators $P$ and $Q$ all of the relationships (2.6) are fulfilled.
Let's consider two subspaces

$$U_1 = \operatorname{Im} P, \qquad U_2 = \operatorname{Ker} P.$$

For an arbitrary vector $\mathbf{v} \in V$ we have the expansion

$$\mathbf{v} = 1(\mathbf{v}) = (P + Q)\mathbf{v} = P(\mathbf{v}) + Q(\mathbf{v}), \tag{2.7}$$

where $\mathbf{u}_1 = P(\mathbf{v}) \in \operatorname{Im} P$. From the relationship $P \circ Q = 0$ for the other vector
$\mathbf{u}_2 = Q(\mathbf{v})$ in (2.7) we get the equality

$$P(\mathbf{u}_2) = P(Q(\mathbf{v})) = \mathbf{0}.$$

This means $\mathbf{u}_2 \in \operatorname{Ker} P$. Hence, $V = \operatorname{Im} P + \operatorname{Ker} P$. Let's prove that this is a
direct sum of subspaces. We should prove the uniqueness of the expansion

$$\mathbf{v} = \mathbf{u}_1 + \mathbf{u}_2, \tag{2.8}$$

where $\mathbf{u}_1 \in \operatorname{Im} P$ and $\mathbf{u}_2 \in \operatorname{Ker} P$. From $\mathbf{u}_1 \in \operatorname{Im} P$ we conclude that $\mathbf{u}_1 = P(\mathbf{v}_1)$
for some vector $\mathbf{v}_1 \in V$. From $\mathbf{u}_2 \in \operatorname{Ker} P$ we derive $P(\mathbf{u}_2) = \mathbf{0}$. Then from (2.8)
we derive the following formulas:

$$P(\mathbf{v}) = P(\mathbf{u}_1) + P(\mathbf{u}_2) = P(P(\mathbf{v}_1)) = P^2(\mathbf{v}_1) = P(\mathbf{v}_1) = \mathbf{u}_1,$$
$$Q(\mathbf{v}) = (1 - P)\mathbf{v} = \mathbf{v} - P(\mathbf{v}) = \mathbf{v} - \mathbf{u}_1 = \mathbf{u}_2.$$

The relationships derived just above mean that any expansion (2.8) coincides with
(2.7). Hence, it is unique and we have

$$V = \operatorname{Im} P \oplus \operatorname{Ker} P.$$

The operator $P$ maps an arbitrary vector $\mathbf{v} \in V$ to the first component of the
expansion (2.8). Hence, $P$ is an operator of projection onto the subspace $\operatorname{Im} P$
parallel to the subspace $\operatorname{Ker} P$. □

Now suppose that a linear vector space $V$ is expanded into the direct sum of
several of its subspaces $U_1, \ldots, U_s$:

$$V = U_1 \oplus \ldots \oplus U_s. \tag{2.9}$$

This expansion of the space $V$ implies the unique expansion for each vector $\mathbf{v} \in V$:

$$\mathbf{v} = \mathbf{u}_1 + \ldots + \mathbf{u}_s, \text{ where } \mathbf{u}_i \in U_i. \tag{2.10}$$

Definition 2.2. The operator $P_i: V \to V$ that maps each vector $\mathbf{v} \in V$ to its
$i$-th component in the expansion (2.10) is called the operator of projection onto
$U_i$ parallel to other subspaces.

The proof of linearity of the operators $P_i$ is practically the same as in the case of
two subspaces considered in theorem 2.1. It is based on the uniqueness of the
expansion (2.10).

Let's choose a vector $\mathbf{u} \in U_i$. Then its expansion (2.10) looks like:

$$\mathbf{u} = \mathbf{0} + \ldots + \mathbf{0} + \mathbf{u} + \mathbf{0} + \ldots + \mathbf{0}.$$

Therefore for any such vector $\mathbf{u}$ we have $P_i(\mathbf{u}) = \mathbf{u}$ and $P_j(\mathbf{u}) = \mathbf{0}$ for $j \neq i$. For
the projection operators $P_i$ this yields

$$(P_i)^2 = P_i, \qquad P_i \circ P_j = 0 \text{ for } i \neq j. \tag{2.11}$$

Moreover, from the definition of $P_i$ we get

$$P_1 + \ldots + P_s = 1. \tag{2.12}$$

Due to the first relationship in (2.11), the theory of the separate operators $P_i$ does
not differ from the theory of projectors defined by two-component expansions of
the space $V$. In the case of multicomponent expansions the collective behavior of
projectors is of particular interest. A family of projection operators $P_1, \ldots, P_s$ is
called a concordant family of projectors if the operators of this family satisfy the
relationships (2.11) and (2.12).

Theorem 2.3. A family of projection operators $P_1, \ldots, P_s$ is determined by
an expansion of the form (2.9) if and only if it is concordant, i.e. if these operators
satisfy the relationships (2.11) and (2.12).

Proof. We already know that a family of projectors determined by an
expansion (2.9) satisfies the relationships (2.11) and (2.12). Let's prove the converse
proposition. Suppose that we have a family of operators $P_1, \ldots, P_s$ satisfying the
relationships (2.11) and (2.12). Then we define the subspaces $U_i = \operatorname{Im} P_i$. Due to
the relationship (2.12) for an arbitrary vector $\mathbf{v} \in V$ we get

$$\mathbf{v} = P_1(\mathbf{v}) + \ldots + P_s(\mathbf{v}), \tag{2.13}$$

where $P_i(\mathbf{v}) \in \operatorname{Im} P_i$. Hence, we have the expansion of $V$ into a sum of subspaces

$$V = \operatorname{Im} P_1 + \ldots + \operatorname{Im} P_s. \tag{2.14}$$

Let's prove that the sum (2.14) is a direct sum. For this purpose we consider an
expansion of some arbitrary vector $\mathbf{v} \in V$ corresponding to the expansion (2.14):

$$\mathbf{v} = \mathbf{u}_1 + \ldots + \mathbf{u}_s, \text{ where } \mathbf{u}_i \in \operatorname{Im} P_i. \tag{2.15}$$

From $\mathbf{u}_i \in \operatorname{Im} P_i$ we conclude that $\mathbf{u}_i = P_i(\mathbf{v}_i)$, where $\mathbf{v}_i \in V$. Then from the
expansion (2.15) we derive the following equality:

$$P_i(\mathbf{v}) = P_i(\mathbf{u}_1 + \ldots + \mathbf{u}_s) = \sum_{j=1}^{s} P_i(P_j(\mathbf{v}_j)).$$

Due to (2.11) only one term in the above sum can be nonzero, namely the term with $j = i$. Therefore, we have

$$P_i(\mathbf{v}) = (P_i)^2 \mathbf{v}_i = P_i(\mathbf{v}_i) = \mathbf{u}_i.$$

This equality shows that an arbitrary expansion (2.15) should coincide with (2.13).
This means that (2.13) is the unique expansion of that sort. Hence, the sum (2.14)
is a direct sum and $P_i$ is the projection operator onto the $i$-th component of the
sum (2.14) parallel to its other components. The theorem is proved. □

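Now we consider a projection operator $P$ as an example for the first approach
to the problem of bringing the matrix of a linear operator to a canonic form.

Theorem 2.4. For any nonzero projection operator $P$ in a finite-dimensional
vector space $V$ there is a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ such that the matrix of the operator
$P$ has the following form in that basis:

$$P = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & & 0 & & \\ & & & & \ddots & \\ & & & & & 0 \end{pmatrix}. \tag{2.16}$$

Here the first $s$ diagonal elements are equal to 1, while all other elements of the
matrix are equal to zero.

Proof. Let's consider the subspaces $\operatorname{Im} P$ and $\operatorname{Ker} P$. From the condition
$P \neq 0$ we conclude that $s = \dim(\operatorname{Im} P) \neq 0$. Then we choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$
in $U_1 = \operatorname{Im} P$ and, if $U_1 \neq V$, we complete it by choosing a basis $\mathbf{e}_{s+1}, \ldots, \mathbf{e}_n$
in $U_2 = \operatorname{Ker} P$. The sum of these two subspaces is a direct sum: $V = U_1 \oplus U_2$,
therefore, joining together two bases in them, we get a basis of $V$ (see the proof of
theorem 6.3 in Chapter I).

Now let's apply the operator $P$ to the vectors of the basis we have constructed
just above. This operator projects onto $U_1$ parallel to $U_2$, therefore, we have

$$P(\mathbf{e}_i) = \begin{cases} \mathbf{e}_i & \text{for } i = 1, \ldots, s, \\ \mathbf{0} & \text{for } i = s + 1, \ldots, n. \end{cases}$$

Due to this formula it's clear that (2.16) is the matrix of the projection operator
$P$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. □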
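The construction in the proof can be traced on a small example. The sketch below is an illustration under assumptions: the projector $P$, the basis vectors and the helper `matmul` are chosen here and do not come from the text. The basis consists of $\mathbf{e}_1 = (1, 0)$ from $\operatorname{Im} P$ and $\mathbf{e}_2 = (1, 1)$ from $\operatorname{Ker} P$, and the matrix in the new basis is computed by the usual rule $S^{-1} P S$, where the columns of $S$ are the new basis vectors.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Assumed example: P maps (x, y) to (x - y, 0), so P^2 = P.
P = [[1, -1],
     [0, 0]]

# Columns of S are the basis vectors: e1 = (1, 0) spans Im P, e2 = (1, 1) spans Ker P.
S = [[1, 1],
     [0, 1]]
T = [[1, -1],
     [0, 1]]          # inverse of S, written out by hand for this 2x2 case

P_canonic = matmul(T, matmul(P, S))   # matrix of P in the basis e1, e2
print(P_canonic)
```

In this basis the matrix has a single unit on the diagonal, which agrees with $s = \dim(\operatorname{Im} P) = 1$.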
§ 3. Invariant subspaces.
Restriction and factorization of operators.



Let $f: V \to V$ be a linear operator and let $U$ be a subspace of $V$. Let's restrict
the domain of $f$ to the subspace $U$. Thereby the image of $f$ shrinks to $f(U)$.
However, in general, the subspace $f(U)$ is not contained in the subspace $U$. For
this reason in the general case we should treat the restricted operator $f_U$ as a linear
mapping $f_U: U \to V$, rather than a linear operator.

Definition 3.1. A subspace $U$ is called an invariant subspace of a linear
operator $f: V \to V$ if $f(U) \subset U$, i.e. if $\mathbf{u} \in U$ implies $f(\mathbf{u}) \in U$.

If $U$ is an invariant subspace of $f$, the restriction $f_U$ can be treated as a linear
operator in $U$. Its action upon vectors $\mathbf{u} \in U$ coincides with the action of $f$ upon
$\mathbf{u}$. As for the vectors outside the subspace $U$, the operator $f_U$ cannot be applied
to them at all.

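Theorem 3.1. The kernel and the image of a linear operator $f: V \to V$ are
invariant subspaces of $f$.

Proof. Let's consider the kernel of $f$ first. If $\mathbf{u} \in \operatorname{Ker} f$, then $f(\mathbf{u}) = \mathbf{0}$.
Hence, $f(\mathbf{u}) \in \operatorname{Ker} f$, since the zero vector is an element of any subspace of $V$. The
invariance of the kernel $\operatorname{Ker} f$ is proved.

Now let $\mathbf{u} \in \operatorname{Im} f$. Denote $\mathbf{w} = f(\mathbf{u})$. Then $\mathbf{w}$ is the image of the vector $\mathbf{u}$,
hence, $\mathbf{w} = f(\mathbf{u}) \in \operatorname{Im} f$. The invariance of the image $\operatorname{Im} f$ is proved. □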

Theorem 3.2. The intersection and the sum of an arbitrary number of invariant
subspaces of a linear operator $f: V \to V$ are both invariant subspaces of $f$.

Proof. Let $U_i$, $i \in I$, be a family of invariant subspaces of a linear operator
$f: V \to V$. Let's consider the intersection and the sum of these subspaces:

$$U = \bigcap_{i \in I} U_i, \qquad W = \sum_{i \in I} U_i.$$

In § 6 of Chapter I we have proved that $U$ and $W$ are subspaces of $V$. Now
we should prove that they are invariant subspaces. First, let's prove that
$U$ is an invariant subspace. Consider a vector $\mathbf{u} \in U$. This vector belongs to all
subspaces $U_i$, which are invariant subspaces of $f$. Therefore, $f(\mathbf{u})$ also belongs
to all subspaces $U_i$. This means that $f(\mathbf{u})$ belongs to their intersection $U$. The
invariance of $U$ is proved.

Now let's consider a vector $\mathbf{w} \in W$. According to the definition of the sum of
subspaces, this vector admits the expansion

$$\mathbf{w} = \mathbf{u}_{i_1} + \ldots + \mathbf{u}_{i_s}, \text{ where } \mathbf{u}_{i_r} \in U_{i_r}.$$

Applying the operator $f$ to both sides of this equality, we get:

$$f(\mathbf{w}) = f(\mathbf{u}_{i_1}) + \ldots + f(\mathbf{u}_{i_s}).$$

Due to the invariance of $U_i$ we have $f(\mathbf{u}_{i_r}) \in U_{i_r}$. Hence, $f(\mathbf{w}) \in W$. This yields
the invariance of the sum $W$ of the invariant subspaces $U_i$. □
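Let $U$ be an invariant subspace of a linear operator $f: V \to V$. Let's consider
the factorspace $V/U$ and define the operator $f_{V/U}$ in this factorspace by the formula

$$f_{V/U}(Q) = \operatorname{Cl}_U(f(\mathbf{v})), \text{ where } Q = \operatorname{Cl}_U(\mathbf{v}). \tag{3.1}$$

The operator $f_{V/U}: V/U \to V/U$ acting according to the rule (3.1) is called the
factoroperator or the quotient operator of the operator $f$ by the subspace $U$. We
can rewrite the formula (3.1) in a shorter form as follows:

$$f_{V/U}(\operatorname{Cl}_U(\mathbf{v})) = \operatorname{Cl}_U(f(\mathbf{v})). \tag{3.2}$$

Like formulas (7.3) in Chapter I, the formulas (3.1) and (3.2) contain a certain
amount of uncertainty due to the freedom in the choice of a representative $\mathbf{v}$
in a coset $Q = \operatorname{Cl}_U(\mathbf{v})$. Therefore, we need to prove their correctness.

Theorem 3.3. The formula (3.1) and the equivalent formula (3.2) are both
correct. They define a linear operator $f_{V/U}$ in the factorspace $V/U$.

Proof. Let's consider two different representative vectors in a coset $Q$, i.e. let
$\mathbf{v}, \tilde{\mathbf{v}} \in Q$. Then $\tilde{\mathbf{v}} - \mathbf{v} \in U$. According to the formula (3.1), we consider two
possible results of applying the operator $f_{V/U}$ to $Q$:

$$f_{V/U}(Q) = \operatorname{Cl}_U(f(\mathbf{v})), \qquad f_{V/U}(Q) = \operatorname{Cl}_U(f(\tilde{\mathbf{v}})).$$

Let's calculate the difference of these two possible results:

$$\operatorname{Cl}_U(f(\tilde{\mathbf{v}})) - \operatorname{Cl}_U(f(\mathbf{v})) = \operatorname{Cl}_U(f(\tilde{\mathbf{v}}) - f(\mathbf{v})) = \operatorname{Cl}_U(f(\tilde{\mathbf{v}} - \mathbf{v})).$$

Note that the vector $\mathbf{u} = \tilde{\mathbf{v}} - \mathbf{v}$ belongs to the subspace $U$. Since $U$ is an invariant
subspace, we have $\tilde{\mathbf{u}} = f(\mathbf{u}) \in U$. Therefore, we get

$$\operatorname{Cl}_U(f(\tilde{\mathbf{v}})) - \operatorname{Cl}_U(f(\mathbf{v})) = \operatorname{Cl}_U(\tilde{\mathbf{u}}) = 0.$$

The coincidence $\operatorname{Cl}_U(f(\tilde{\mathbf{v}})) = \operatorname{Cl}_U(f(\mathbf{v}))$ that we have just proved establishes
the correctness of the formula (3.1) and of the formula (3.2) as well.

Now let's prove the linearity of the factoroperator $f_{V/U}: V/U \to V/U$. We shall
carry out the appropriate calculations on the basis of formula (3.1):

$$\begin{aligned} f_{V/U}(Q_1 + Q_2) &= f_{V/U}(\operatorname{Cl}_U(\mathbf{v}_1) + \operatorname{Cl}_U(\mathbf{v}_2)) = f_{V/U}(\operatorname{Cl}_U(\mathbf{v}_1 + \mathbf{v}_2)) = \\ &= \operatorname{Cl}_U(f(\mathbf{v}_1 + \mathbf{v}_2)) = \operatorname{Cl}_U(f(\mathbf{v}_1) + f(\mathbf{v}_2)) = \\ &= \operatorname{Cl}_U(f(\mathbf{v}_1)) + \operatorname{Cl}_U(f(\mathbf{v}_2)) = f_{V/U}(Q_1) + f_{V/U}(Q_2), \end{aligned}$$

$$\begin{aligned} f_{V/U}(\alpha \cdot Q) &= f_{V/U}(\alpha \cdot \operatorname{Cl}_U(\mathbf{v})) = f_{V/U}(\operatorname{Cl}_U(\alpha \cdot \mathbf{v})) = \operatorname{Cl}_U(f(\alpha \cdot \mathbf{v})) = \\ &= \operatorname{Cl}_U(\alpha \cdot f(\mathbf{v})) = \alpha \cdot \operatorname{Cl}_U(f(\mathbf{v})) = \alpha \cdot f_{V/U}(Q). \end{aligned}$$

These calculations show that $f_{V/U}$ is a linear operator. The theorem is proved. □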

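Theorem 3.4. Suppose that $U$ is a common invariant subspace of two linear
operators $f, g \in \operatorname{End}(V)$. Then $U$ is an invariant subspace of the operators $f + g$, $\alpha \cdot f$
and $f \circ g$ as well. For their restrictions to the subspace $U$ and for the corresponding
factoroperators we have the following relationships:

$$\begin{aligned} (f + g)_U &= f_U + g_U, & (f + g)_{V/U} &= f_{V/U} + g_{V/U}; \\ (\alpha \cdot f)_U &= \alpha \cdot f_U, & (\alpha \cdot f)_{V/U} &= \alpha \cdot f_{V/U}; \\ (f \circ g)_U &= f_U \circ g_U, & (f \circ g)_{V/U} &= f_{V/U} \circ g_{V/U}. \end{aligned}$$

Proof. Let's begin with the first case. Denote $h = f + g$ and assume that
$\mathbf{u}$ is an arbitrary vector of $U$. Then $f(\mathbf{u}) \in U$ and $g(\mathbf{u}) \in U$ since $U$ is
an invariant subspace of both operators $f$ and $g$. For this reason we obtain
$h(\mathbf{u}) = f(\mathbf{u}) + g(\mathbf{u}) \in U$. This proves that $U$ is an invariant subspace of $h$. The
relationship $h_U = f_U + g_U$ follows from $h = f + g$ since the results of applying the
restricted operators to $\mathbf{u}$ do not differ from the results of applying $f$, $g$, and $h$ to
$\mathbf{u}$. The corresponding relationship for the factoroperators is proved as follows:

$$\begin{aligned} h_{V/U}(\operatorname{Cl}_U(\mathbf{v})) &= \operatorname{Cl}_U(h(\mathbf{v})) = \operatorname{Cl}_U(f(\mathbf{v}) + g(\mathbf{v})) = \\ &= \operatorname{Cl}_U(f(\mathbf{v})) + \operatorname{Cl}_U(g(\mathbf{v})) = f_{V/U}(\operatorname{Cl}_U(\mathbf{v})) + g_{V/U}(\operatorname{Cl}_U(\mathbf{v})). \end{aligned}$$

The second case, where we denote $h = \alpha \cdot f$, is not much different from the first
one. From $\mathbf{u} \in U$ it follows that $f(\mathbf{u}) \in U$, hence, $h(\mathbf{u}) = \alpha \cdot f(\mathbf{u}) \in U$. The
relationship $h_U = \alpha \cdot f_U$ is now obvious for the same reasons as above. For the
factoroperators we perform the following calculations:

$$\begin{aligned} h_{V/U}(\operatorname{Cl}_U(\mathbf{v})) &= \operatorname{Cl}_U(h(\mathbf{v})) = \operatorname{Cl}_U(\alpha \cdot f(\mathbf{v})) = \\ &= \alpha \cdot \operatorname{Cl}_U(f(\mathbf{v})) = \alpha \cdot f_{V/U}(\operatorname{Cl}_U(\mathbf{v})) = (\alpha \cdot f_{V/U})(\operatorname{Cl}_U(\mathbf{v})). \end{aligned}$$

Now we consider the third case. Here we denote $h = f \circ g$. From $\mathbf{u} \in U$ we
derive $\mathbf{w} = g(\mathbf{u}) \in U$, then from $\mathbf{w} \in U$ we derive $f(\mathbf{w}) \in U$, which means that
$U$ is an invariant subspace of $h$. Indeed, $h(\mathbf{u}) = f(g(\mathbf{u})) = f(\mathbf{w}) \in U$. For the
restricted operators this yields the equality

$$h_U(\mathbf{u}) = h(\mathbf{u}) = f(g(\mathbf{u})) = f_U(g_U(\mathbf{u})).$$

Hence, $h_U = f_U \circ g_U$. Passing to factoroperators, we obtain

$$\begin{aligned} h_{V/U}(\operatorname{Cl}_U(\mathbf{v})) &= \operatorname{Cl}_U(h(\mathbf{v})) = \operatorname{Cl}_U(f(g(\mathbf{v}))) = f_{V/U}(\operatorname{Cl}_U(g(\mathbf{v}))) = \\ &= f_{V/U}(g_{V/U}(\operatorname{Cl}_U(\mathbf{v}))) = (f_{V/U} \circ g_{V/U})(\operatorname{Cl}_U(\mathbf{v})). \end{aligned}$$

The above calculations prove the last relationship of the theorem 3.4. □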

Theorem 3.5. Let $V = U_1 \oplus \ldots \oplus U_s$ be an expansion of a linear vector space
$V$ into a direct sum of its subspaces. The subspaces $U_1, \ldots, U_s$ are invariant
subspaces of an operator $f: V \to V$ if and only if the projection operators $P_1, \ldots, P_s$
associated with the expansion $V = U_1 \oplus \ldots \oplus U_s$ commute with the operator $f$, i.e.
if $f \circ P_i = P_i \circ f$, where $i = 1, \ldots, s$.

Proof. Suppose that all subspaces $U_i$ are invariant under the action of the
operator $f$. For an arbitrary vector $\mathbf{v} \in V$ we consider the expansion determined
by the direct sum $V = U_1 \oplus \ldots \oplus U_s$:

$$\mathbf{v} = \mathbf{u}_1 + \ldots + \mathbf{u}_s.$$

Here $\mathbf{u}_i = P_i(\mathbf{v}) \in U_i$. From this expansion we derive

$$P_i(f(\mathbf{v})) = P_i(f(\mathbf{u}_1) + \ldots + f(\mathbf{u}_s)) = f(\mathbf{u}_i) = f(P_i(\mathbf{v})).$$

We used the inclusion $\mathbf{w}_j = f(\mathbf{u}_j) \in U_j$ that follows from the invariance of the
subspace $U_j$ under the action of $f$. We also used the following properties of
projection operators (they follow from (2.11) and $U_i = \operatorname{Im} P_i$, see § 2 above):

$$P_i(\mathbf{w}_j) = \begin{cases} \mathbf{w}_j & \text{for } j = i, \\ \mathbf{0} & \text{for } j \neq i. \end{cases}$$

Since $\mathbf{v}$ is an arbitrary vector of the space $V$, from the above equality $P_i(f(\mathbf{v})) =
f(P_i(\mathbf{v}))$ we derive $f \circ P_i = P_i \circ f$.

Conversely, suppose that the operator $f$ commutes with all projection operators
$P_1, \ldots, P_s$ associated with the expansion $V = U_1 \oplus \ldots \oplus U_s$. Let $\mathbf{u}$ be an arbitrary
vector of the subspace $U_i$. Then we denote $\mathbf{w} = f(\mathbf{u})$ and for $\mathbf{w}$ we derive

$$P_i(\mathbf{w}) = P_i(f(\mathbf{u})) = f(P_i(\mathbf{u})) = f(\mathbf{u}) = \mathbf{w}.$$

Remember that $P_i$ projects onto the subspace $U_i$. Hence, $P_i(\mathbf{w}) \in U_i$. But due
to the above equality we find that $P_i(\mathbf{w}) = \mathbf{w} = f(\mathbf{u}) \in U_i$. Thus we have shown
that the subspace $U_i$ is invariant under the action of the operator $f$. The theorem is
completely proved. □

Let's consider a linear operator $f$ acting in a finite-dimensional linear vector space $V$
and possessing an invariant subspace $U$. Suppose that $\dim V = n$ and $\dim U = s$.
Let's choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ in $U$ and then, if $s < n$, complete this basis up
to a basis in $V$. Denote by $\mathbf{e}_{s+1}, \ldots, \mathbf{e}_n$ the complementary vectors. For $j \leq s$
due to the invariance of the subspace $U$ under the action of $f$ we have $f(\mathbf{e}_j) \in U$.
Therefore, in the expansions of these vectors

$$f(\mathbf{e}_j) = \sum_{i=1}^{s} F^i_j \cdot \mathbf{e}_i, \text{ where } j \leq s,$$

the summation index $i$ runs from 1 to $s$, but not from 1 to $n$ as it should in the
general case, where we expand an arbitrary vector of $V$. This means that if we
construct the matrix of the operator $f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, this matrix will
be composed of blocks with the lower left block being zero:

$$F = \begin{pmatrix} F^1_1 & \dots & F^1_s & F^1_{s+1} & \dots & F^1_n \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ F^s_1 & \dots & F^s_s & F^s_{s+1} & \dots & F^s_n \\ 0 & \dots & 0 & F^{s+1}_{s+1} & \dots & F^{s+1}_n \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \dots & 0 & F^n_{s+1} & \dots & F^n_n \end{pmatrix}. \tag{3.3}$$

Here the upper blocks occupy the first $s$ rows. Matrices of this form are called
blockwise-triangular matrices. The upper left
diagonal block in the matrix (3.3) coincides with the matrix of the restricted operator
$f_U: U \to U$ in the invariant subspace $U$.

The lower right diagonal block of the matrix (3.3) can also be interpreted in
a special way. In order to find this interpretation let's consider the cosets of
the complementary vectors in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$:

$$\mathbf{E}_1 = \operatorname{Cl}_U(\mathbf{e}_{s+1}), \ \ldots, \ \mathbf{E}_{n-s} = \operatorname{Cl}_U(\mathbf{e}_n). \tag{3.4}$$

When proving the theorem 7.6 in Chapter I, we have found that these cosets form
a basis in the factorspace $V/U$. Applying the factoroperator $f_{V/U}$ to (3.4), we get

$$f_{V/U}(\mathbf{E}_j) = f_{V/U}(\operatorname{Cl}_U(\mathbf{e}_{s+j})) = \operatorname{Cl}_U(f(\mathbf{e}_{s+j})) = \sum_{i=1}^{s} F^i_{s+j} \cdot \operatorname{Cl}_U(\mathbf{e}_i) + \sum_{i=s+1}^{n} F^i_{s+j} \cdot \operatorname{Cl}_U(\mathbf{e}_i).$$

The first sum in the above expression is equal to zero since the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$
belong to $U$. Then, shifting the index $i + s \to i$, we find

$$f_{V/U}(\mathbf{E}_j) = \sum_{i=1}^{n-s} F^{s+i}_{s+j} \cdot \mathbf{E}_i.$$

Looking at this formula, we see that the matrix of the factoroperator $f_{V/U}$ in the
basis (3.4) coincides with the lower right diagonal block in the matrix (3.3).

Theorem 3.6. Let $f: V \to V$ be a linear operator in a finite-dimensional space
and let $U$ be an invariant subspace of this operator. Then the determinant of $f$
is equal to the product of two determinants: the determinant of the restricted
operator $f_U$ and that of the factoroperator $f_{V/U}$:

$$\det f = \det(f_U) \cdot \det(f_{V/U}).$$

The proof of this theorem is immediate from the following fact well-known
in the theory of determinants: the determinant of a blockwise-triangular matrix is
equal to the product of determinants of all its diagonal blocks.
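The determinant formula can be checked on a concrete blockwise-triangular matrix. The sketch below is only an illustration: the $3 \times 3$ matrix `F`, the block size `s` and the helper `det` are assumptions made for this example. The first two basis vectors span the invariant subspace $U$, so the upper left $2 \times 2$ block plays the role of the matrix of $f_U$ and the lower right $1 \times 1$ block plays the role of the matrix of $f_{V/U}$.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: delete the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Assumed example of the form (3.3) with n = 3 and s = 2:
# the lower left 1x2 block is zero.
F = [[1, 2, 5],
     [3, 4, 6],
     [0, 0, 7]]
s = 2

F_U = [row[:s] for row in F[:s]]     # upper left block: matrix of the restriction
F_VU = [row[s:] for row in F[s:]]    # lower right block: matrix of the factoroperator

print(det(F), det(F_U) * det(F_VU))
```

Both printed numbers coincide, as theorem 3.6 predicts for this matrix.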
§ 4. Eigenvalues and eigenvectors.



Let $f: V \to V$ be a linear operator. A nonzero vector $\mathbf{v} \neq \mathbf{0}$ of the space $V$ is
called an eigenvector of the operator $f$ if $f(\mathbf{v}) = \lambda \cdot \mathbf{v}$, where $\lambda \in \mathbb{K}$. The number $\lambda$
is called the eigenvalue of the operator $f$ associated with the eigenvector $\mathbf{v}$.

One eigenvalue $\lambda$ of an operator $f$ can be associated with several or even with
infinitely many eigenvectors. But conversely, if an eigenvector is given, the
associated eigenvalue $\lambda$ for this eigenvector is unique. Indeed, from the equality
$f(\mathbf{v}) = \lambda \cdot \mathbf{v} = \lambda' \cdot \mathbf{v}$ and from $\mathbf{v} \neq \mathbf{0}$ it follows that $\lambda = \lambda'$.

Let $\mathbf{v}$ be an eigenvector of the operator $f: V \to V$. Let's consider the other
operator $h_\lambda = f - \lambda \cdot 1$. Then the equation $f(\mathbf{v}) = \lambda \cdot \mathbf{v}$ can be rewritten as

$$(f - \lambda \cdot 1)\mathbf{v} = \mathbf{0}. \tag{4.1}$$

Hence, $\mathbf{v} \in \operatorname{Ker}(f - \lambda \cdot 1)$. The condition $\mathbf{v} \neq \mathbf{0}$ means that the kernel of this
operator is nonzero: $\operatorname{Ker}(f - \lambda \cdot 1) \neq \{\mathbf{0}\}$.

Definition 4.1. A number $\lambda \in \mathbb{K}$ is called an eigenvalue of a linear operator
$f: V \to V$ if the subspace $V_\lambda = \operatorname{Ker}(f - \lambda \cdot 1)$ is nonzero. This subspace
$V_\lambda = \operatorname{Ker}(f - \lambda \cdot 1) \neq \{\mathbf{0}\}$ is called the eigenspace associated with the eigenvalue $\lambda$,
while any nonzero vector of $V_\lambda$ is called an eigenvector of the operator $f$ associated
with the eigenvalue $\lambda$.

The collection of all eigenvalues of an operator $f$ is sometimes called the
spectrum of this operator, while the branch of mathematics studying the spectra of
linear operators is known as the spectral theory of operators. The spectral theory
of linear operators in finite-dimensional spaces is the simplest one. This is the
very theory that is usually studied in the course of linear algebra.

Let $f: V \to V$ be a linear operator in a finite-dimensional linear vector space
$V$. In order to find the spectrum of this operator we apply the corollary of
theorem 1.4. Due to this corollary a number $\lambda \in \mathbb{K}$ is an eigenvalue of the operator
$f$ if and only if it satisfies the equation

$$\det(f - \lambda \cdot 1) = 0. \tag{4.2}$$

The equation (4.2) is called the characteristic equation of the operator $f$, its roots
are called the characteristic numbers of the operator $f$.

Let $\dim V = n$. Then the determinant in formula (4.2) is equal to the
determinant of a square $n \times n$ matrix. The matrix of the operator $h_\lambda = f - \lambda \cdot 1$
is derived from the matrix of the operator $f$ by subtracting $\lambda$ from each element
on the primary diagonal of this matrix:

$$H_\lambda = \begin{pmatrix} F^1_1 - \lambda & F^1_2 & \dots & F^1_n \\ F^2_1 & F^2_2 - \lambda & \dots & F^2_n \\ \vdots & \vdots & \ddots & \vdots \\ F^n_1 & F^n_2 & \dots & F^n_n - \lambda \end{pmatrix}. \tag{4.3}$$
The determinant of the matrix (4.3) is a polynomial in $\lambda$:

$$\det(f - \lambda \cdot 1) = (-\lambda)^n + F_1 (-\lambda)^{n-1} + \ldots + F_n. \tag{4.4}$$

The polynomial in the right hand side of (4.4) is called the characteristic polynomial
of the operator $f$. If $F$ is the matrix of the operator $f$ in some basis, then the
coefficients $F_1, \ldots, F_n$ of the characteristic polynomial (4.4) are expressed through
the elements of the matrix $F$. However, note that the left hand side of (4.4) is basis
independent, therefore, the coefficients $F_1, \ldots, F_n$ do not actually depend on the
choice of basis. They are scalar invariants of the operator $f$. The first and the last
invariants in (4.4) are the most popular ones:

$$F_1 = \operatorname{tr} f, \qquad F_n = \det f.$$

The invariant $F_1$ is called the trace of the operator $f$. It is calculated through the
matrix of this operator according to the following formula:

$$\operatorname{tr} f = \sum_{i=1}^{n} F^i_i. \tag{4.5}$$

We shall not derive the formula (4.5) since it is well-known in the theory of
determinants. We shall only derive the invariance of the trace directly on the
basis of formula (1.4), which describes the transformation of the matrix of a linear
operator under a change of basis (here $T = S^{-1}$, so that $\sum_{p} S^j_p\, T^p_i = \delta^j_i$):

$$\sum_{p=1}^{n} \tilde F^p_p = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \sum_{p=1}^{n} S^j_p\, T^p_i \right) F^i_j = \sum_{i=1}^{n} \sum_{j=1}^{n} \delta^j_i\, F^i_j = \sum_{i=1}^{n} F^i_i.$$
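The basis independence of the trace (and of the determinant) can be observed on a small numeric example. This is a sketch under assumptions: the matrices `F`, `S` and its inverse `T` are picked here for illustration, and the transformation rule is taken in the common form $\tilde F = T\,F\,S$ with $T = S^{-1}$.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    """Sum of the diagonal elements, as in formula (4.5)."""
    return sum(m[i][i] for i in range(len(m)))

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

F = [[1, 2],
     [3, 4]]          # matrix of an operator in the initial basis (assumed example)
S = [[1, 1],
     [0, 1]]          # transition matrix to a new basis (assumed example)
T = [[1, -1],
     [0, 1]]          # inverse of S, written out by hand

F_new = matmul(T, matmul(F, S))   # matrix of the same operator in the new basis

print(F_new, trace(F), trace(F_new), det2(F), det2(F_new))
```

The matrix itself changes, while its trace and determinant stay the same, as expected for scalar invariants.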

Upon substituting (4.4) into (4.2) we see that the characteristic equation (4.2)
of the operator $f$ is a polynomial equation of $n$-th order with respect to $\lambda$:

$$(-\lambda)^n + F_1 (-\lambda)^{n-1} + \ldots + F_n = 0. \tag{4.6}$$

Therefore we can estimate the number of eigenvalues of the operator $f$. Any
eigenvalue $\lambda \in \mathbb{K}$ is a root of the characteristic equation (4.6). However, not every
root of the equation (4.6) is an eigenvalue of the operator $f$. The matter is that
a polynomial equation with coefficients in the numeric field $\mathbb{K}$ can have roots in
some larger field $\tilde{\mathbb{K}}$ (e.g. $\mathbb{Q} \subset \mathbb{R}$ or $\mathbb{R} \subset \mathbb{C}$). For a characteristic number $\lambda$ of the
operator $f$ to be an eigenvalue of this operator it should belong to $\mathbb{K}$. From the
course of general algebra we know that the total number of roots of the equation
(4.6) counted according to their multiplicity and including those belonging to the
extensions of the field $\mathbb{K}$ is equal to $n$ (see [4]).

Theorem 4.1. The number of eigenvalues of a linear operator $f: V \to V$ does
not exceed the dimension of the space $V$.

Consider the case $\mathbb{K} = \mathbb{Q}$. The roots of a polynomial equation with rational
coefficients are not necessarily rational numbers: the equation $\lambda^2 - 3 = 0$ is an
example. In the case of real numbers $\mathbb{K} = \mathbb{R}$ a polynomial equation with real
coefficients can also have non-real roots, e.g. the equation $\lambda^2 + \sqrt{3} = 0$. However,
the field of complex numbers $\mathbb{K} = \mathbb{C}$ is an exception.
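The difference between characteristic numbers and eigenvalues can be seen on the rotation of the plane by a right angle. The following sketch is an illustration only: the function name `characteristic_roots_2x2` and the example matrix are assumptions. It relies on the fact that for $n = 2$ the characteristic equation (4.6) reads $\lambda^2 - (\operatorname{tr} f)\,\lambda + \det f = 0$.

```python
import cmath

def characteristic_roots_2x2(m):
    """Roots of lambda^2 - tr(m) * lambda + det(m) = 0, computed over C."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)   # complex square root of the discriminant
    return [(tr + root) / 2, (tr - root) / 2]

# Assumed example: rotation of the real plane by 90 degrees.
rotation = [[0, -1],
            [1, 0]]

roots = characteristic_roots_2x2(rotation)
# Over K = R only the real roots are eigenvalues of the operator.
real_eigenvalues = [r.real for r in roots if r.imag == 0]

print(roots, real_eigenvalues)
```

For this matrix the characteristic equation is $\lambda^2 + 1 = 0$: it has two characteristic numbers in $\mathbb{C}$, but over $\mathbb{K} = \mathbb{R}$ the operator has no eigenvalues at all.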
Theorem 4.2. An arbitrary polynomial equation of $n$-th order with complex
coefficients has exactly $n$ complex roots counted according to their multiplicity.

We shall not prove this theorem here, referring the reader to the course of
general algebra (see [4]). The theorem 4.2 is known as the «basic theorem of
algebra», while the property of complex numbers stated in this theorem is called
the algebraic closure of $\mathbb{C}$, i.e. $\mathbb{C}$ is an algebraically closed numeric field.

Definition 4.2. A numeric field $\mathbb{K}$ is called an algebraically closed field if the
roots of any polynomial equation with coefficients from $\mathbb{K}$ are again in $\mathbb{K}$.

Certainly, $\mathbb{C}$ is not the unique algebraically closed field. However, in the list of
numeric fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ that we consider in this book, only the field of complex
numbers is algebraically closed.

Let $\lambda$ be an eigenvalue of a linear operator $f$. Then $\lambda$ is a root of the equation
(4.6). The multiplicity of this root $\lambda$ in the equation (4.6) is called the multiplicity
of the eigenvalue $\lambda$.

Theorem 4.3. For a linear operator $f: V \to V$ in a complex linear vector space
$V$ the number of its eigenvalues counted according to their multiplicities is exactly
equal to the dimension of $V$.

This proposition strengthens the theorem 4.1. It is an immediate consequence
of the algebraic closure of the field of complex numbers $\mathbb{C}$. In the case $\mathbb{K} = \mathbb{C}$ the
characteristic polynomial (4.4) is factorized into a product of terms linear in $\lambda$:

$$\det(f - \lambda \cdot 1) = \prod_{i=1}^{n} (\lambda_i - \lambda). \tag{4.7}$$

For some operators such an expansion can occur in the case $\mathbb{K} = \mathbb{Q}$ or $\mathbb{K} = \mathbb{R}$,
however, it is not a typical situation. If $\lambda_1, \ldots, \lambda_n$ are understood as characteristic
numbers of the operator $f$, then the formula (4.7) is always valid.

Due to the formula (4.7) we can present the numeric invariants $F_1, \ldots, F_n$ of
the operator $f$ as elementary symmetric polynomials of its characteristic numbers:

$$F_i = \sigma_i(\lambda_1, \ldots, \lambda_n).$$

In particular, for the trace and for the determinant of the operator $f$ we have

$$\operatorname{tr} f = \sum_{i=1}^{n} \lambda_i, \qquad \det f = \prod_{i=1}^{n} \lambda_i. \tag{4.8}$$

The theory of symmetric polynomials is given in the course of general algebra (see,
for example, the book [4]).

Theorem 4.4. For any eigenvalue $\lambda$ of a linear operator $f: V \to V$ the
associated eigenspace $V_\lambda$ is invariant under the action of $f$.

Proof. The definition 4.1 of an eigenspace $V_\lambda$ of a linear operator $f$ can
be reformulated as $V_\lambda = \{\mathbf{v} \in V : f(\mathbf{v}) = \lambda \cdot \mathbf{v}\}$. Therefore, $\mathbf{v} \in V_\lambda$ implies
$f(\mathbf{v}) = \lambda \cdot \mathbf{v} \in V_\lambda$, which proves the invariance of $V_\lambda$. □
We know that the set of linear operators in a space $V$ forms the algebra $\operatorname{End}(V)$
over the numeric field $\mathbb{K}$. However, this algebra is too big. Let's consider some
operator $f \in \operatorname{End}(V)$ and complement it with the identical operator $1$. Within
the algebra $\operatorname{End}(V)$ we can take positive integer powers of the operator $f$, we can
multiply them by numbers from $\mathbb{K}$, we can add such products, and we can add to
them scalar operators obtained by multiplying the identical operator $1$ by various
numbers from $\mathbb{K}$. As a result we obtain various operators of the form

$$P(f) = \alpha_p \cdot f^p + \ldots + \alpha_1 \cdot f + \alpha_0 \cdot 1. \tag{4.9}$$

The set of all operators of the form (4.9) is called the polynomial envelope of the
operator $f$; it is denoted $\mathbb{K}[f]$. This is a subset of $\operatorname{End}(V)$ closed with respect to
all algebraic operations in $\operatorname{End}(V)$. Such subsets are usually called subalgebras.
It is important to say that the subalgebra $\mathbb{K}[f]$ is commutative, i.e. for any two
polynomials $P$ and $Q$ the corresponding operators (4.9) commute:

$$P(f) \circ Q(f) = Q(f) \circ P(f). \tag{4.10}$$

The equality (4.10) is verified by direct calculation. Indeed, let $P(f)$ and $Q(f)$ be
two operator polynomials of the form:

$$P(f) = \sum_{i=0}^{p} \alpha_i \cdot f^i, \qquad Q(f) = \sum_{j=0}^{q} \beta_j \cdot f^j.$$

Here we denote $f^0 = 1$. This relationship should be treated as the definition of
the zeroth power of the operator $f$. Then

$$P(f) \circ Q(f) = \sum_{i=0}^{p} \sum_{j=0}^{q} (\alpha_i\, \beta_j) \cdot f^{i+j} = Q(f) \circ P(f).$$

These calculations prove the relationship (4.10).
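The commutativity (4.10) can be illustrated by evaluating two polynomials on the same matrix. The sketch below is an example under assumptions: the matrix `F`, the two coefficient lists and all helper names are chosen here for illustration. A polynomial is given by the list of its coefficients $\alpha_0, \alpha_1, \ldots, \alpha_p$ and is evaluated by the Horner scheme.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, f):
    """Operator polynomial (4.9): coeffs[i] is the coefficient of f^i (Horner scheme)."""
    n = len(f)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        # result <- result * f + c * 1
        result = matmul(result, f)
        for i in range(n):
            result[i][i] += c
    return result

F = [[1, 2],
     [3, 4]]                  # assumed example matrix of an operator f

P_f = poly_at([1, 0, 1], F)   # P(f) = f^2 + 1
Q_f = poly_at([-3, 2], F)     # Q(f) = 2 f - 3 * 1

print(matmul(P_f, Q_f) == matmul(Q_f, P_f))
```

Two arbitrary matrices generally do not commute; here they do because both are polynomials in the same operator.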

Theorem 4.5. Let $U$ be an invariant subspace of an operator $f$. Then it is
invariant under the action of any operator from the polynomial envelope $\mathbb{K}[f]$.

Proof. Let $\mathbf{u}$ be an arbitrary vector of $U$. Let's consider the following
vectors: $\mathbf{u}_0 = \mathbf{u}$, $\mathbf{u}_1 = f(\mathbf{u})$, $\mathbf{u}_2 = f^2(\mathbf{u})$, $\ldots$, $\mathbf{u}_p = f^p(\mathbf{u})$. Every next vector in this
sequence is obtained by applying the operator $f$ to the previous one: $\mathbf{u}_{i+1} = f(\mathbf{u}_i)$.
Therefore, from $\mathbf{u}_0 \in U$ it follows that $\mathbf{u}_1 \in U$ since $U$ is an invariant subspace of
$f$. Then, in turn, we successively obtain $\mathbf{u}_2 \in U$, $\mathbf{u}_3 \in U$, and so on up to $\mathbf{u}_p \in U$.
Applying the operator $P(f)$ of the form (4.9) to the vector $\mathbf{u}$, we get

$$P(f)\,\mathbf{u} = \alpha_p \cdot \mathbf{u}_p + \ldots + \alpha_1 \cdot \mathbf{u}_1 + \alpha_0 \cdot \mathbf{u}_0.$$

Hence, due to $\mathbf{u}_i \in U$ we find that $P(f)\,\mathbf{u} \in U$, which proves the invariance of $U$
under the action of the operator $P(f)$. □

The following fact is curious: if $\lambda$ is an eigenvalue of the operator $f$ and if $\mathbf{v}$ is
an associated eigenvector, then $P(f)\,\mathbf{v} = P(\lambda) \cdot \mathbf{v}$. Therefore, any eigenvector $\mathbf{v}$ of
the operator $f$ is an eigenvector of the operator $P(f)$. The converse proposition,
however, is not true.

Let $\lambda_1, \ldots, \lambda_s$ be a set of mutually distinct eigenvalues of the operator $f$. Let's
consider the operators $h_i = f - \lambda_i \cdot 1$, which certainly belong to the polynomial
envelope of $f$. The permutability of any two such operators follows from (4.10).
The eigenspace $V_{\lambda_i}$ of the operator $f$ is determined as the kernel of the operator
$h_i$. According to the definition 4.1, it is nonzero. Moreover, the theorems 4.4 and
4.5 say that $V_{\lambda_i}$ is invariant under the action of $f$ and of all other operators $h_j$.

Theorem 4.6. Let $\lambda_1, \ldots, \lambda_s$ be a set of mutually distinct eigenvalues of the
operator $f: V \to V$. Then the sum of associated eigenspaces $V_{\lambda_1}, \ldots, V_{\lambda_s}$ is a
direct sum: $V_{\lambda_1} + \ldots + V_{\lambda_s} = V_{\lambda_1} \oplus \ldots \oplus V_{\lambda_s}$.

Note that the set of mutually distinct eigenvalues $\lambda_1, \ldots, \lambda_s$ of the operator $f$
in this theorem could be the complete set of such eigenvalues, or it could include
only a part of such eigenvalues. This makes no difference for the result of the
theorem 4.6: it remains valid in either case.

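Proof. Let's denote by $W$ the sum of eigenspaces of the operator $f$:

$$W = V_{\lambda_1} + \ldots + V_{\lambda_s}. \tag{4.11}$$

In order to prove that the sum (4.11) is a direct sum we need to prove that for an
arbitrary vector $\mathbf{w} \in W$ the expansion

$$\mathbf{w} = \mathbf{v}_1 + \ldots + \mathbf{v}_s, \text{ where } \mathbf{v}_i \in V_{\lambda_i}, \tag{4.12}$$

is unique. For this purpose we consider the operator $f_i$ defined by the formula

$$f_i = \prod_{r \neq i}^{s} h_r.$$

The operator $f_i$ belongs to the polynomial envelope of the operator $f$ and

$$f_i(\mathbf{v}_j) = \left( \prod_{r \neq i}^{s} (\lambda_j - \lambda_r) \right) \cdot \mathbf{v}_j. \tag{4.13}$$

This follows from $\mathbf{v}_j \in V_{\lambda_j}$, which implies $h_r(\mathbf{v}_j) = (\lambda_j - \lambda_r) \cdot \mathbf{v}_j$. The formula
(4.13) means that $f_i(\mathbf{v}_j) = \mathbf{0}$ for all $j \neq i$. Applying the operator $f_i$ to both sides
of the expansion (4.12), we get the equality

$$f_i(\mathbf{w}) = \left( \prod_{r \neq i}^{s} (\lambda_i - \lambda_r) \right) \cdot \mathbf{v}_i.$$

Hence, for the vector $\mathbf{v}_i$ in the expansion (4.12) we derive

$$\mathbf{v}_i = \frac{f_i(\mathbf{w})}{\prod\limits_{r \neq i}^{s} (\lambda_i - \lambda_r)}. \tag{4.14}$$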



CopyRight © Sharipov R.A., 2004. 
The formula (4.14) uniquely determines all summands in the expansion (4.12) if
the vector $\mathbf{w} \in W$ is given. This means that the expansion (4.12) is unique and
the sum of subspaces (4.11) is a direct sum. □
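The formula (4.14) is constructive: it recovers the eigencomponents of a vector without solving any linear system. The following sketch is an illustration under assumptions: the matrix `F`, its eigenvalues, the vector `w` and the helper names are chosen here for the example. The matrix has eigenvalues 2 and 3 with eigenvectors $(1, 0)$ and $(1, 1)$ respectively.

```python
from fractions import Fraction

def matvec(m, v):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

F = [[2, 1],
     [0, 3]]                 # assumed example operator
eigenvalues = [2, 3]         # its mutually distinct eigenvalues
w = [5, 2]                   # vector to be expanded, w = 3*(1,0) + 2*(1,1)

def component(i):
    """Component v_i of w in the i-th eigenspace, computed by formula (4.14)."""
    n = len(F)
    vec = [Fraction(x) for x in w]
    denominator = Fraction(1)
    for r, lam_r in enumerate(eigenvalues):
        if r == i:
            continue
        # h_r = F - lam_r * 1; applying all h_r with r != i gives f_i(w).
        h_r = [[F[a][b] - (lam_r if a == b else 0) for b in range(n)]
               for a in range(n)]
        vec = matvec(h_r, vec)
        denominator *= eigenvalues[i] - lam_r
    return [x / denominator for x in vec]

components = [component(i) for i in range(len(eigenvalues))]
print(components)
```

Each computed component is an eigenvector of `F` (or zero), and the components add up to `w`.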

Definition 4.3. A linear operator $f: V \to V$ in a linear vector space $V$ is
called a diagonalizable operator if there is a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in the space $V$ such
that the matrix of the operator $f$ is diagonal in this basis.

Theorem 4.7. An operator $f: V \to V$ is diagonalizable if and only if the sum
of all its eigenspaces coincides with $V$.

Proof. Let $f$ be a diagonalizable operator. Then we can choose a basis
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ such that its matrix $F$ in this basis is diagonal, i.e. only the diagonal
elements $F^i_i$ of this matrix can be nonzero. Then the relationship (1.2), which
determines the matrix $F$, is written as $f(\mathbf{e}_i) = F^i_i \cdot \mathbf{e}_i$. Hence, each basis vector $\mathbf{e}_i$
is an eigenvector of the operator $f$, while $\lambda_i = F^i_i$ is its associated eigenvalue. The
expansion of an arbitrary vector $\mathbf{v}$ in this basis is an expansion by eigenvectors
of the operator $f$. Therefore, having collected together the terms with coinciding
eigenvalues in this expansion, we get the expansion

$$\mathbf{v} = \mathbf{v}_1 + \ldots + \mathbf{v}_s, \text{ where } \mathbf{v}_i \in V_{\lambda_i}.$$

Since $\mathbf{v}$ is an arbitrary vector of $V$, this means that $V_{\lambda_1} + \ldots + V_{\lambda_s} = V$. The
direct proposition of the theorem is proved.

Conversely, suppose that $\lambda_1, \ldots, \lambda_s$ is the total set of mutually distinct
eigenvalues of the operator $f$ and assume that $V_{\lambda_1} + \ldots + V_{\lambda_s} = V$. The theorem 4.6
says that this is a direct sum: $V_{\lambda_1} \oplus \ldots \oplus V_{\lambda_s} = V$. Therefore, choosing
a basis in each eigenspace and joining them together, we get a basis in $V$ (see
theorem 6.3 in Chapter I). This is a basis composed of eigenvectors of the operator
$f$; the application of $f$ to each basis vector reduces to multiplying this vector by
its associated eigenvalue. Therefore, the matrix $F$ of the operator $f$ in this basis
is diagonal. Its diagonal elements coincide with the eigenvalues of the operator $f$.
The theorem is proved. □

Assume that an operator $f: V \to V$ is diagonalizable and assume that we have
chosen a basis where its matrix is diagonal. Then the matrix $H_\lambda$ in formula (4.3)
is also diagonal. Hence, we immediately derive the following formula:

$$\det(f - \lambda \cdot 1) = \prod_{i=1}^{n} (F^i_i - \lambda).$$

Due to this equality we conclude that the characteristic polynomial of a
diagonalizable operator is factorized into a product of linear terms and all roots
of the characteristic equation belong to the field $\mathbb{K}$ (not to its extension). This
means that the characteristic numbers of a diagonalizable operator coincide with its
eigenvalues. This is a necessary condition for the operator $f$ to be diagonalizable.
However, it is not a sufficient condition. Even in the case of the algebraically closed
field of complex numbers $\mathbb{K} = \mathbb{C}$ there are non-diagonalizable operators in vector
spaces over the field $\mathbb{C}$.
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§ 5. Nilpotent operators. 

Definition 5.1. A linear operator $f\colon V \to V$ is called a nilpotent operator if
for any vector $\mathbf{v} \in V$ there is a positive integer $k$ such that $f^k(\mathbf{v}) = 0$.

According to the definition 5.1, for any vector $\mathbf{v}$ there is a positive integer $k$
(depending on $\mathbf{v}$) such that $f^k(\mathbf{v}) = 0$. Such a number is not unique and its choice
has no upper bound: indeed, if $m > k$ and $f^k(\mathbf{v}) = 0$, then $f^m(\mathbf{v}) = 0$. However,
among all such numbers there is a minimal positive number $k = k_{\min}$ (depending
on $\mathbf{v}$) such that $f^k(\mathbf{v}) = 0$. This minimal number $k_{\min}$ is called the height of the
vector $\mathbf{v}$ with respect to the nilpotent operator $f$. The height of the zero vector is
taken to be zero by definition; for any nonzero vector $\mathbf{v}$ its height is greater than
or equal to one. Let's denote the height of $\mathbf{v}$ by $\nu(\mathbf{v})$ and define the number

$$\nu(f) = \max_{\mathbf{v} \in V} \nu(\mathbf{v}). \tag{5.1}$$

For each vector $\mathbf{v} \in V$ its height is finite, but the maximum in (5.1) can be infinite
since the number of vectors in a linear vector space is usually infinite.

Definition 5.2. In the case where the maximum in the formula (5.1) is
finite, a nilpotent operator $f$ is called an operator of finite height and the number
$\nu(f)$ is called the height of the nilpotent operator $f$.

Theorem 5.1. In a finite-dimensional linear vector space $V$ the height $\nu(f)$ of
any nilpotent operator $f\colon V \to V$ is finite.

PROOF. Let's choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and consider the heights
$\nu(\mathbf{e}_1), \ldots, \nu(\mathbf{e}_n)$ of all basis vectors with respect to $f$. Then denote

$$m = \max\{\nu(\mathbf{e}_1), \ldots, \nu(\mathbf{e}_n)\}.$$

For an arbitrary vector $\mathbf{v} \in V$ consider its expansion $\mathbf{v} = v^1 \cdot \mathbf{e}_1 + \ldots + v^n \cdot \mathbf{e}_n$.
Then, applying the operator $f^m$ to $\mathbf{v}$, we find

$$f^m(\mathbf{v}) = \sum_{i=1}^{n} v^i \cdot f^m(\mathbf{e}_i) = 0. \tag{5.2}$$

Due to the formula (5.2) we see that the heights of all vectors of the space $V$ are
bounded by the number $m$. This means that the height of a nilpotent operator $f$
is finite: $\nu(f) = m < \infty$. □
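
The definition of the height and the argument of theorem 5.1 can be tried out
numerically. The following Python sketch is only an illustration under stated
assumptions: the operator is assumed to be given by its matrix in some basis (a
list of rows with integer entries, the j-th column holding the coordinates of
$f(\mathbf{e}_j)$), and the names `apply`, `height`, `operator_height` and the sample
matrix `F` are introduced here for this purpose only.

```python
def apply(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def height(matrix, vector):
    """Smallest k >= 0 with f^k(v) = 0.

    Assumes the matrix is nilpotent; otherwise the loop may never end.
    """
    k = 0
    while any(vector):
        vector = apply(matrix, vector)
        k += 1
    return k


def operator_height(matrix):
    """Height of a nilpotent operator, computed as in the proof of theorem 5.1:
    the maximum of the heights of the basis vectors."""
    n = len(matrix)
    basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    return max(height(matrix, e) for e in basis)


# Hypothetical example: f(e1) = 0, f(e2) = e1, f(e3) = 0.
F = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]
```

Here the zero vector gets the height zero, in agreement with the convention
adopted above, and the maximum is taken over basis vectors only, which is
sufficient due to (5.2).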

Theorem 5.2. If $f\colon V \to V$ is a nilpotent operator and if $U$ is an invariant
subspace of the operator $f$, then the restricted operator $f|_U$ and the factoroperator
$f_{V/U}$ both are nilpotent.

PROOF. Any vector $\mathbf{u}$ of the subspace $U \subset V$ is a vector of $V$. Therefore, there
is an integer $k > 0$ such that $f^k(\mathbf{u}) = 0$. However, the result of applying
the restricted operator $f|_U$ to a vector of $U$ coincides with the result of applying
the initial operator $f$ to this vector. Hence, we have

$$(f|_U)^k\, \mathbf{u} = f^k(\mathbf{u}) = 0.$$

This proves that $f|_U$ is a nilpotent operator. In the case of the factoroperator we
consider an arbitrary coset $Q$ in the factorspace $V/U$. Let $Q = \operatorname{Cl}_U(\mathbf{v})$, where $\mathbf{v}$ is
some fixed vector in $Q$, and let $k = \nu(\mathbf{v})$ be the height of this vector $\mathbf{v}$ with respect
to the operator $f$. Then we can calculate

$$(f_{V/U})^k\, Q = \operatorname{Cl}_U(f^k(\mathbf{v})) = 0.$$

Now it is clear that the factoroperator $f_{V/U}$ is a nilpotent operator. The theorem
is completely proved. □

Theorem 5.3. A nilpotent operator f cannot have a nonzero eigenvalue. 

PROOF. Let $\lambda$ be an eigenvalue of a nilpotent operator $f$ and let $\mathbf{v} \neq 0$ be an
associated eigenvector. Then we have $f(\mathbf{v}) = \lambda \cdot \mathbf{v}$. On the other hand, since $f$ is
nilpotent, there is a number $k > 0$ such that $f^k(\mathbf{v}) = 0$. Then we derive

$$f^k(\mathbf{v}) = \lambda^k \cdot \mathbf{v} = 0.$$

But $\mathbf{v} \neq 0$, therefore, $\lambda^k = 0$. This is an equation for $\lambda$, and $\lambda = 0$ is its unique
root. The theorem is proved. □

In the finite-dimensional case this theorem can be strengthened as follows.

Theorem 5.4. In a finite-dimensional space $V$ of the dimension $\dim V = n$ any
nilpotent operator $f$ has exactly one eigenvalue $\lambda = 0$ with the multiplicity $n$.

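Proof. We shall prove this theorem by induction on $n = \dim V$. In the case
$n = 1$ we fix some vector $\mathbf{v} \neq 0$ in $V$ and denote by $k = \nu(\mathbf{v})$ its height. Then
$f^k(\mathbf{v}) = 0$ and $f^{k-1}(\mathbf{v}) \neq 0$. This means that $\mathbf{w} = f^{k-1}(\mathbf{v}) \neq 0$ is an eigenvector
of $f$ with the eigenvalue $\lambda = 0$ since $f(\mathbf{w}) = f^k(\mathbf{v}) = 0 = 0 \cdot \mathbf{w}$. For $n = 1$ the
characteristic polynomial is of degree one, so $\lambda = 0$ is its only root and its
multiplicity is equal to 1. The base of the induction is proved.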

Suppose that the theorem is proved for any finite-dimensional space of the
dimension less than $n$ and consider a space $V$ of the dimension $n = \dim V$. As
above, let's fix some vector $\mathbf{v} \neq 0$ in $V$ and denote by $k = \nu(\mathbf{v})$ its height
with respect to the operator $f$. Then $f^k(\mathbf{v}) = 0$ and $\mathbf{w} = f^{k-1}(\mathbf{v}) \neq 0$. Hence, for
the nonzero vector $\mathbf{w}$ we get the following series of equalities:

$$f(\mathbf{w}) = f(f^{k-1}(\mathbf{v})) = f^k(\mathbf{v}) = 0 = 0 \cdot \mathbf{w}.$$

Hence, $\mathbf{w}$ is an eigenvector of the operator $f$ and $\lambda = 0$ is its associated eigenvalue.
Let's consider the eigenspace $U = V_0$ corresponding to the eigenvalue $\lambda = 0$.
Let's denote $m = \dim U \neq 0$. The restricted operator $f|_U$ is zero, hence, for the
characteristic polynomial of this operator $f|_U = 0$ we derive

$$\det(f|_U - \lambda \cdot 1) = (-\lambda)^m.$$

Now, applying the theorem 3.6, we derive the characteristic polynomial of $f$:

$$\det(f - \lambda \cdot 1) = (-\lambda)^m \det(f_{V/U} - \lambda \cdot 1). \tag{5.3}$$

The factoroperator $f_{V/U}$ is an operator in the factorspace $V/U$ whose dimension
$n - m$ is less than $n$. Due to the theorem 5.2 the factoroperator $f_{V/U}$ is nilpotent,
therefore, we can apply the inductive hypothesis to it. Then for the characteristic
polynomial of the factoroperator $f_{V/U}$ we get

$$\det(f_{V/U} - \lambda \cdot 1) = (-\lambda)^{n-m}. \tag{5.4}$$

Comparing the above relationships (5.3) and (5.4), we find the characteristic
polynomial of the initial operator $f$:

$$\det(f - \lambda \cdot 1) = (-\lambda)^n.$$

This means that $\lambda = 0$ is the only eigenvalue of the operator $f$ and its multiplicity
is $n = \dim V$. The theorem is proved. □
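
Theorem 5.4 can be checked on examples. The sketch below is an illustration, not
a part of the proof. It relies on the Faddeev-LeVerrier recursion, a standard
method that is not discussed in this book, to compute the coefficients of the
monic polynomial $\det(t \cdot 1 - F)$; this polynomial differs from
$\det(F - t \cdot 1)$ only by the factor $(-1)^n$. The name `char_poly_monic` and the
sample matrices are hypothetical.

```python
from fractions import Fraction


def char_poly_monic(a):
    """Coefficients [c_0, ..., c_n] of det(t*1 - A) = c_0 + c_1 t + ... + c_n t^n,
    computed by the Faddeev-LeVerrier recursion with exact fractions."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * 1
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(A * M_k) / k
        trace = sum(a[i][t] * m[t][i] for i in range(n) for t in range(n))
        coeffs[n - k] = -trace / k
    return coeffs


# Two nilpotent matrices: a Jordan-like one and one that is not triangular.
N3 = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
N2 = [[1, 1],
      [-1, -1]]
```

For both matrices all coefficients except the leading one vanish, i.e. the
characteristic polynomial reduces to $t^n$, as theorem 5.4 predicts.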

Let $f\colon V \to V$ be a nilpotent operator. Consider a vector $\mathbf{v} \in V$ and denote by
$k = \nu(\mathbf{v})$ its height with respect to the operator $f$. This vector $\mathbf{v}$ produces the
chain of $k$ vectors according to the following formulas:

$$\mathbf{v}_1 = f^{k-1}(\mathbf{v}), \quad \mathbf{v}_2 = f^{k-2}(\mathbf{v}), \quad \ldots, \quad \mathbf{v}_k = f^0(\mathbf{v}) = \mathbf{v}. \tag{5.5}$$

The chain vectors (5.5) are related to each other as follows: $\mathbf{v}_{j-1} = f(\mathbf{v}_j)$ for
$j = 2, \ldots, k$. Let's apply the operator $f$ to each vector in the chain (5.5). Then
the first vector $\mathbf{v}_1$ vanishes. Applying $f$ to the other $k - 1$ vectors we get another
chain:

$$\mathbf{w}_1 = f^{k-1}(\mathbf{v}), \quad \mathbf{w}_2 = f^{k-2}(\mathbf{v}), \quad \ldots, \quad \mathbf{w}_{k-1} = f(\mathbf{v}). \tag{5.6}$$

Comparing these two chains (5.5) and (5.6), we see that they are almost the same,
but the second chain is shorter. It is obtained from the first one by removing the
last vector $\mathbf{v}_k = \mathbf{v}$.
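
The construction of the chain (5.5) is easy to reproduce on a computer. The
following Python sketch is given for illustration only; it assumes that the
nilpotent operator is specified by an integer matrix, and the names `apply`,
`chain` and the sample data `F`, `v` are hypothetical.

```python
def apply(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def chain(matrix, vector):
    """Return [v_1, ..., v_k] with v_k = v and v_{j-1} = f(v_j), as in (5.5).

    Assumes the matrix is nilpotent; otherwise the loop may never end.
    """
    vectors = []
    while any(vector):
        vectors.append(vector)
        vector = apply(matrix, vector)
    return vectors[::-1]  # side vector f^{k-1}(v) first, v itself last


# Hypothetical nilpotent operator: f(e1) = 0, f(e2) = e1, f(e3) = e2.
F = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
v = [0, 0, 1]
c = chain(F, v)  # chain generated by v
```

The chain generated by $f(\mathbf{v})$ is then `chain(F, apply(F, v))`; it coincides with
`c` without its last vector, which is exactly the relation between (5.5) and (5.6).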

The vector $\mathbf{v}_1$ is called the side vector or the eigenvector of the chain (5.5).
The other vectors are called the adjoint vectors of the chain. If the side vectors of
two chains are different, then in these two chains there are no coinciding vectors at
all. However, there is an even stronger result. It is known as the theorem on «linear
independence of chains».

Theorem 5.5. If the side vectors in several chains of the form (5.5) are linearly 
independent, then the whole set of vectors in these chains is linearly independent. 

Proof. We consider $s$ chains of the form (5.5). In order to specify a chain
vector we use two indices: $\mathbf{v}_{i,j}$. The first index $i$ is the number of the chain to which
this vector $\mathbf{v}_{i,j}$ belongs, the second index $j$ specifies the number of this vector
within the $i$-th chain. Denote by $k_1, \ldots, k_s$ the lengths of our chains. Without
loss of generality we can assume that the chains are arranged in the order of
decreasing length, i.e. we have the following inequalities:

$$k_1 \geqslant k_2 \geqslant \ldots \geqslant k_s \geqslant 1. \tag{5.7}$$

Let $k = \max\{k_1, \ldots, k_s\}$. We shall prove the theorem by induction on $k$. If
$k = 1$ then the lengths of all chains are equal to 1. Therefore, they contain only
the side vectors and have no adjoint vectors at all. The proposition of the theorem
in this case is obviously true.






Suppose that the theorem is valid for the chains whose lengths are not greater
than $k - 1$. For our $s$ chains, whose lengths are bounded by the number $k$, we
consider a linear combination of all their vectors being equal to zero:

$$\sum_{i=1}^{s} \sum_{j=1}^{k_i} \alpha_{i,j} \cdot \mathbf{v}_{i,j} = 0. \tag{5.8}$$

From this equality we should derive the triviality of the linear combination in
its left hand side. Let's apply the operator $f$ to both sides of (5.8) and use the
following quite obvious relationships:

$$f(\mathbf{v}_{i,j}) = \begin{cases} 0 & \text{for } j = 1, \\ \mathbf{v}_{i,j-1} & \text{for } j > 1. \end{cases}$$

If we take into account (5.7), then the result of applying $f$ to (5.8) is written as

$$\sum_{i=1}^{s} \sum_{j=1}^{k_i} \alpha_{i,j} \cdot f(\mathbf{v}_{i,j}) = \sum_{i=1}^{r} \sum_{j=2}^{k_i} \alpha_{i,j} \cdot \mathbf{v}_{i,j-1} = 0. \tag{5.9}$$

In the typical situation $r = s$. However, sometimes certain chains of vectors can drop
out of the above sums entirely. This happens if some of the chains were of the length
1. In this case $r < s$ and $k_{r+1} = \ldots = k_s = 1$. The lengths of all chains in (5.7)
cannot be equal to 1 since $k > 1$.

Replacing the summation index $j$ by $j + 1$ in the last sum, we can write (5.9) as
follows:

$$\sum_{i=1}^{r} \sum_{j=1}^{k_i - 1} \alpha_{i,j+1} \cdot \mathbf{v}_{i,j} = 0. \tag{5.10}$$

The left side of the relationship (5.10) is again a linear combination of chain
vectors. Here we have $r$ chains whose lengths are less by 1 as compared to the
original ones in (5.8), while their side vectors $\mathbf{v}_{i,1}$ are the same. Now we can apply
the inductive hypothesis, which yields the linear independence of all vectors
presented in (5.10). Hence, all coefficients of the linear combination in the left hand
side of (5.10) are equal to zero. When applied to (5.8) this fact means that most of
the terms in the left hand side of this equality do actually vanish. The remainder is
written as follows:

$$\sum_{i=1}^{s} \alpha_{i,1} \cdot \mathbf{v}_{i,1} = 0. \tag{5.11}$$

Now in the linear combination (5.11) we have only the side vectors of the initial
chains. They are linearly independent by the assumption of the theorem. Therefore,
the linear combination (5.11) is also trivial. From the triviality of (5.10) and (5.11)
it follows that the initial linear combination (5.8) is trivial too. We have completed
the inductive step and thus have proved the theorem as a whole. □

Let $f\colon V \to V$ be a nilpotent operator in a linear vector space $V$ and let $\mathbf{v}$ be a
vector of the height $k = \nu(\mathbf{v})$ in $V$. Consider the chain of vectors (5.5) generated
by $\mathbf{v}$ and denote by $U(\mathbf{v})$ the linear span of chain vectors (5.5):

$$U(\mathbf{v}) = \langle \mathbf{v}_1, \ldots, \mathbf{v}_k \rangle. \tag{5.12}$$

Due to the theorem 5.5 the subspace $U(\mathbf{v})$ is a finite-dimensional subspace and
$\dim U(\mathbf{v}) = k$. The chain vectors (5.5) form a basis in this subspace (5.12). The
following relationships are derived directly from the definition of the chain (5.5):

$$\begin{aligned} f(\mathbf{v}_1) &= 0, \\ f(\mathbf{v}_2) &= \mathbf{v}_1, \\ &\ \ \vdots \\ f(\mathbf{v}_k) &= \mathbf{v}_{k-1}. \end{aligned} \tag{5.13}$$


Due to (5.13) the subspace (5.12) is invariant under the action of the operator $f$.
Hence, we can consider the restricted operator $f|_{U(\mathbf{v})}$ and, using (5.13), we can find
the matrix of this restricted operator in the chain basis $\mathbf{v}_1, \ldots, \mathbf{v}_k$:

$$J_k(0) = \begin{Vmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{Vmatrix}. \tag{5.14}$$

A matrix of the form (5.14) is called a Jordan block or a Jordan cage of a nilpotent
operator. Its primary diagonal is filled with zeros. The next diagonal above it,
parallel to the primary one, is filled with ones. All other space in the matrix
(5.14) is filled with zeros again. The matrix (5.14) is a square $k \times k$ matrix; if
$k = 1$, this matrix degenerates and becomes a purely zero matrix with the only
element: $J_1(0) = \Vert 0 \Vert$.
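
The structure of the block (5.14) can be illustrated by a short computation. The
Python sketch below is a hedged illustration: the function names
`jordan_block_zero`, `matmul` and `power` are introduced here and do not come
from the text. It builds the matrix whose $j$-th column holds the coordinates of
$f(\mathbf{v}_j) = \mathbf{v}_{j-1}$ from (5.13) and raises it to a power.

```python
def jordan_block_zero(k):
    """k x k matrix with ones on the first superdiagonal and zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def power(a, p):
    """a raised to a non-negative integer power p by repeated multiplication."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, a)
    return result


J = jordan_block_zero(4)
```

For this block `power(J, 3)` still has one nonzero entry in its upper right corner,
while `power(J, 4)` is the zero matrix; this reflects the fact that the last chain
vector has the height 4.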

Let $f\colon V \to V$ again be a nilpotent operator. We continue to study vector
chains of the form (5.5). For this purpose let's consider the following subspaces:

$$U_k = \operatorname{Ker} f \cap \operatorname{Im} f^{k-1}. \tag{5.15}$$

If $\mathbf{u} \in U_k$, then $\mathbf{u} \in \operatorname{Im} f^{k-1}$. Therefore, $\mathbf{u} = f^{k-1}(\mathbf{v})$ for some vector $\mathbf{v}$. This
means that $\mathbf{u}$ is a chain vector in a chain of the form (5.5). From the condition
$\mathbf{u} \in \operatorname{Ker} f$ we derive $f(\mathbf{u}) = f^k(\mathbf{v}) = 0$. Hence, for $\mathbf{u} \neq 0$ the vector $\mathbf{v}$ is a vector
of the height $k$ and $\mathbf{u}$ is the side vector in the chain (5.5) initiated by the vector $\mathbf{v}$.
For the subspaces (5.15) we have the sequence of inclusions

$$V_0 = U_1 \supseteq U_2 \supseteq \ldots \supseteq U_k \supseteq \ldots, \tag{5.16}$$

where $V_0 = \operatorname{Ker} f$ is the eigenspace corresponding to the unique eigenvalue $\lambda = 0$
of the nilpotent operator $f$. The inclusions (5.16) follow from the fact that any chain
(5.5) of the length $k$ with the side vector $\mathbf{u} = f^{k-1}(\mathbf{v})$ can be treated as a chain of
the length $k - 1$ by dropping the $k$-th vector $\mathbf{v}_k = \mathbf{v}$ (see (5.5) and (5.6)). Then for
the vector $\mathbf{v}' = f(\mathbf{v})$ we have $\mathbf{u} = f^{k-2}(\mathbf{v}')$. This yields the inclusion of subspaces
$U_k \subset U_{k-1}$ for $k > 1$.

In a finite-dimensional space $V$ the height of any vector $\mathbf{v} \in V$ is bounded by
the height of the nilpotent operator $f$ itself:

$$\nu(\mathbf{v}) \leqslant \nu(f) = m < \infty$$

(see theorem 5.1). Therefore $U_{m+1} = \{0\}$. Hence, the sequence of inclusions (5.16)
terminates on the $m$-th step, i.e. we have a finite sequence of inclusions:

$$V_0 = U_1 \supseteq U_2 \supseteq \ldots \supseteq U_m \supseteq \{0\}. \tag{5.17}$$

Sequences of mutually enclosed subspaces of the form (5.16) or (5.17) are called
flags, while each particular subspace in a flag is called a flag subspace.
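
The dimensions of the flag subspaces (5.15) can be found without constructing the
subspaces themselves. Indeed, the operator $f$ maps $\operatorname{Im} f^{k-1}$ onto $\operatorname{Im} f^k$ and
the kernel of this mapping is $U_k$, hence $\dim U_k = \operatorname{rank} f^{k-1} - \operatorname{rank} f^k$. The
Python sketch below uses this observation; it is an illustration only, the ranks
are computed by Gaussian elimination over exact fractions, and the names `rank`,
`matmul`, `flag_dimensions` and the sample matrix `F` are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def flag_dimensions(matrix):
    """[dim U_1, dim U_2, ...] up to the last nonzero flag subspace,
    using dim U_k = rank f^(k-1) - rank f^k for a nilpotent matrix."""
    n = len(matrix)
    current = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # f^0
    dims = []
    while True:
        following = matmul(current, matrix)
        d = rank(current) - rank(following)
        if d == 0:
            return dims
        dims.append(d)
        current = following


# Hypothetical operator with one chain of the length 3 and one of the length 1.
F = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
```

For this matrix the function returns the list `[2, 1, 1]`: the flag (5.17) consists
of three subspaces, so the height of the operator is 3, and $\dim U_k$ equals the
number of chains whose length is not less than $k$.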

Theorem 5.6. For any nilpotent operator f in a finite-dimensional space V 
there is a basis in V composed by chain vectors of the form (5.5). Such a basis is 
called a canonic basis or a Jordan basis of a nilpotent operator f. 

Proof. The proof of the theorem is based on the fact that the flag (5.17) is
finite. We choose a basis in the smallest subspace $U_m$. Then we complete it up to
a basis in $U_{m-1}$, in $U_{m-2}$, and so on backward along the sequence (5.17). As a
result we construct a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ in $V_0 = \operatorname{Ker} f$. Note that each vector in such
a basis is a side vector of some chain of the form (5.5). For basis vectors of the
subspace $U_m$ the lengths of such chains are equal to $m$. For the complementary
vectors from $U_{m-1}$ their chains are of the length $m - 1$, and further the length of
chains decreases step by step down to one for the complementary vectors in the
largest subspace $U_1 = V_0$.

Let's join together all vectors of the above chains and let's enumerate them by
means of double indices: $\mathbf{e}_{i,j}$. Here $i$ is the number of the chain and $j$ is the
individual number of the vector within the $i$-th chain. Then

$$\mathbf{e}_1 = \mathbf{e}_{1,1}, \quad \ldots, \quad \mathbf{e}_s = \mathbf{e}_{s,1}.$$

Now let's prove that the set of all vectors from the above chains forms a basis in $V$.
The linear independence of this set of vectors follows from the theorem 5.5. We
only have to prove that an arbitrary vector $\mathbf{v} \in V$ can be represented as a linear
combination of chain vectors $\mathbf{e}_{i,j}$. We shall prove this fact by induction on the
height of the vector $\mathbf{v}$.

If $k = \nu(\mathbf{v}) = 1$, then $\mathbf{v} \in \operatorname{Ker} f = V_0$. In this case $\mathbf{v}$ is expanded in the basis
$\mathbf{e}_1, \ldots, \mathbf{e}_s$ of the subspace $V_0$. This is the base of induction.

Now suppose that any vector of the height less than $k$ can be represented as
a linear combination of chain vectors $\mathbf{e}_{i,j}$. Let's take a vector $\mathbf{v}$ of the height $k$
and denote $\mathbf{u} = f^{k-1}(\mathbf{v})$. Then $f(\mathbf{u}) = 0$. This means that $\mathbf{u}$ is a side vector in a
chain of the length $k$ initiated by the vector $\mathbf{v}$. Therefore, $\mathbf{u}$ is an element of the
subspace $U_k$ (see formula (5.15)); this vector can be expanded in the basis of the
subspace $U_k$, which we have constructed above:

$$\mathbf{u} = \sum_{i=1}^{r} \alpha_i \cdot \mathbf{e}_i. \tag{5.18}$$

Note that in the expansion (5.18) we have only a part of the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$,
namely, we have only those of them that belong to $U_k$ and, hence, are side vectors
in the chains of the length not less than $k$. Therefore, we can write
$\mathbf{e}_i = f^{k-1}(\mathbf{e}_{i,k})$ for $i = 1, \ldots, r$. Substituting these expressions into (5.18), we obtain

$$f^{k-1}(\mathbf{v}) = \sum_{i=1}^{r} \alpha_i \cdot f^{k-1}(\mathbf{e}_{i,k}). \tag{5.19}$$

By means of the coefficients of the expansion (5.19) we determine the vector $\mathbf{v}'$:

$$\mathbf{v}' = \mathbf{v} - \sum_{i=1}^{r} \alpha_i \cdot \mathbf{e}_{i,k}. \tag{5.20}$$

Applying the operator $f^{k-1}$ to $\mathbf{v}'$ and taking into account (5.19), we find

$$f^{k-1}(\mathbf{v}') = f^{k-1}(\mathbf{v}) - \sum_{i=1}^{r} \alpha_i \cdot f^{k-1}(\mathbf{e}_{i,k}) = 0.$$

Hence, the height of the vector $\mathbf{v}'$ is less than $k$ and we can apply the inductive
hypothesis to it. This means that $\mathbf{v}'$ can be represented as a linear combination of
chain vectors $\mathbf{e}_{i,j}$. But $\mathbf{v}$ is expressed through $\mathbf{v}'$ as follows:

$$\mathbf{v} = \mathbf{v}' + \sum_{i=1}^{r} \alpha_i \cdot \mathbf{e}_{i,k}.$$

Then $\mathbf{v}$ can also be expressed as a linear combination of chain vectors $\mathbf{e}_{i,j}$. The
inductive step is completed and the theorem as a whole is proved. □

In the basis composed of chain vectors, the existence of which was proved in
theorem 5.6, the matrix of a nilpotent operator $f$ has the following form:

$$F = \begin{Vmatrix} J_{k_1}(0) & & \\ & \ddots & \\ & & J_{k_s}(0) \end{Vmatrix}. \tag{5.21}$$

The matrix (5.21) is blockwise-diagonal, its diagonal blocks are Jordan cages of
the form (5.14), all other space in this matrix is filled with zeros. It is easy
to understand this fact. Indeed, each chain with the side vector $\mathbf{e}_{i,1}$ produces
the invariant subspace $U(\mathbf{v})$ of the form (5.12), where $\mathbf{v} = \mathbf{e}_{i,k_i}$. Due to the
theorem 5.6 the space $V$ is the direct sum of such invariant subspaces:

$$V = U(\mathbf{e}_{1,k_1}) \oplus \ldots \oplus U(\mathbf{e}_{s,k_s}).$$

The matrix (5.21) is called a Jordan form of the matrix of a nilpotent operator.
The theorem 5.6 is known as the theorem on bringing the matrix of a nilpotent
operator to a canonic Jordan form. If the chain basis is constructed
strictly according to the proof of the theorem 5.6, then Jordan cages are arranged
in the order of decreasing sizes:

$$k_1 \geqslant k_2 \geqslant \ldots \geqslant k_s.$$

However, a permutation of the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$ can change this order, and this
usually happens in practice.

Theorem 5.7. The height of a nilpotent operator $f$ in a finite-dimensional space
$V$ is less than or equal to the dimension $n = \dim V$ of this space, and $f^n = 0$.

Proof. Above in proving the theorem 5.1 we noted that the height $\nu(f)$ of a
nilpotent operator $f$ coincides with the greatest height of basis vectors. Due to
the theorem 5.6 now we can choose the chain basis. The height of a chain vector is
not greater than the length of the chain (5.5) to which it belongs. Therefore, the
height of basis vectors in a chain basis is not greater than the number of vectors
in such a basis. This yields $\nu(f) \leqslant n = \dim V$. The height of an arbitrary vector
$\mathbf{v}$ of $V$ is not greater than the height of the operator $f$. Therefore, $f^n(\mathbf{v}) = 0$ for
all $\mathbf{v} \in V$. This means that $f^n = 0$. The theorem is proved. □
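
Theorem 5.7 provides a finite test for nilpotency: an operator in an
$n$-dimensional space is nilpotent if and only if $f^n = 0$ (the converse
implication is immediate from the definition 5.1). A minimal Python sketch of this
test is given below; it assumes that the operator is specified by a square integer
matrix, and the names `matmul` and `is_nilpotent` are introduced for illustration.

```python
def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def is_nilpotent(matrix):
    """Check whether the n-th power of an n x n matrix is zero (theorem 5.7)."""
    n = len(matrix)
    result = matrix
    for _ in range(n - 1):  # after the loop, result holds matrix ** n
        result = matmul(result, matrix)
    return all(x == 0 for row in result for x in row)
```

For example, `is_nilpotent([[1, 1], [-1, -1]])` is true since the square of this
matrix vanishes, while `is_nilpotent([[0, 1], [1, 0]])` is false since the square
of that matrix is the unit matrix.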

§ 6. Root subspaces. Two theorems 
on the sum of root subspaces. 

Definition 6.1. The root subspace of a linear operator $f\colon V \to V$ corresponding
to its eigenvalue $\lambda$ is the set

$$V(\lambda) = \{\mathbf{v} \in V : \ \exists\, k \ ((k \in \mathbb{N}) \ \&\ ((f - \lambda \cdot 1)^k\, \mathbf{v} = 0))\}$$

that consists of vectors vanishing under the action of some positive integer power
of the operator $h_\lambda = f - \lambda \cdot 1$.

For each positive integer $k$ we define the subspace $V(k, \lambda) = \operatorname{Ker}\,(h_\lambda)^k$. For
$k = 1$ the subspace $V(1, \lambda)$ coincides with the eigenspace $V_\lambda$. Note that
$(h_\lambda)^k\, \mathbf{v} = 0$ implies $(h_\lambda)^{k+1}\, \mathbf{v} = 0$. Therefore we have the sequence of inclusions

$$V(1, \lambda) \subseteq V(2, \lambda) \subseteq \ldots \subseteq V(k, \lambda) \subseteq \ldots \tag{6.1}$$
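
The growth of the subspaces in (6.1) can be observed numerically through their
dimensions, since $\dim \operatorname{Ker}\,(h_\lambda)^k = n - \operatorname{rank}\,(h_\lambda)^k$ in an $n$-dimensional
space. The following Python sketch is an illustration under stated assumptions: the
operator is given by an integer matrix, the ranks are computed by Gaussian
elimination over exact fractions, and the names `rank`, `matmul`,
`root_flag_dimensions` and the sample matrix `A` are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def root_flag_dimensions(matrix, lam, steps):
    """[dim V(1, lam), ..., dim V(steps, lam)], where V(k, lam) = Ker (f - lam*1)^k."""
    n = len(matrix)
    h = [[matrix[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    current = h
    dims = []
    for _ in range(steps):
        dims.append(n - rank(current))
        current = matmul(current, h)
    return dims


# Hypothetical operator with the eigenvalues 2 and 3.
A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
```

For $\lambda = 2$ the dimensions are 1, 2, 2: the sequence grows at the second step
and then stabilizes. For $\lambda = 3$ they are 1, 1, 1, i.e. the sequence stabilizes
already at the eigenspace.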

It is easy to see that all subspaces in the sequence (6.1) are enclosed into the root
subspace $V(\lambda)$. Moreover, $V(\lambda)$ is the union of the subspaces (6.1):

$$V(\lambda) = \bigcup_{k=1}^{\infty} V(k, \lambda) = \sum_{k=1}^{\infty} V(k, \lambda). \tag{6.2}$$

In this case the sum of the subspaces $V(k, \lambda)$ coincides with their
union. Indeed, let $\mathbf{v}$ be a vector of the sum of subspaces $V(k, \lambda)$. Then

$$\mathbf{v} = \mathbf{v}_{k_1} + \ldots + \mathbf{v}_{k_s}, \text{ where } \mathbf{v}_{k_i} \in V(k_i, \lambda). \tag{6.3}$$

Let $k = \max\{k_1, \ldots, k_s\}$, then from the sequence of inclusions (6.1) we derive
$\mathbf{v}_{k_i} \in V(k, \lambda)$. Therefore the vector (6.3) belongs to $V(k, \lambda)$, hence, it belongs to
the union of all subspaces $V(k, \lambda)$.

The proof of coincidence of the sum and the union in (6.2) is based only on the
inclusions (6.1). Therefore, we have proved the following more general theorem.

Theorem 6.1. The sum of a growing sequence of mutually enclosed subspaces 
coincides with their union. 

The theorem 6.1 shows that the set $V(\lambda)$ in definition 6.1 is actually a subspace
in $V$. This subspace is nonzero since it comprises the eigenspace $V_\lambda$ as a subset.

Theorem 6.2. A root subspace $V(\lambda)$ of an operator $f$ is invariant under the
action of $f$ and of all operators from its polynomial envelope $P(f)$.

Proof. Let $\mathbf{v} \in V(\lambda)$. Then there exists a positive integer number $k$ such that
$(h_\lambda)^k\, \mathbf{v} = 0$. Let's consider the vector $\mathbf{w} = f(\mathbf{v})$. For this vector we have

$$(h_\lambda)^k\, \mathbf{w} = (h_\lambda)^k \circ f\, \mathbf{v} = f \circ (h_\lambda)^k\, \mathbf{v} = f((h_\lambda)^k(\mathbf{v})) = 0.$$

Here we used the permutability of the operators $h_\lambda$ and $f$; it follows from the
inclusion $h_\lambda \in P(f)$. Due to the above equality we have $\mathbf{w} = f(\mathbf{v}) \in V(\lambda)$. The
invariance of $V(\lambda)$ under the action of $f$ is proved. Its invariance under the action
of operators from $P(f)$ now follows from the theorem 4.5. □

Theorem 6.3. Let $\lambda$ and $\mu$ be two eigenvalues of a linear operator $f\colon V \to V$.
Then the restriction of the operator $h_\lambda = f - \lambda \cdot 1$ to the root subspace $V(\mu)$ is

(1) a bijective operator if $\mu \neq \lambda$;

(2) a nilpotent operator if $\mu = \lambda$.

PROOF. Let's prove the first proposition of the theorem. We already know that
the subspace $V(\mu)$ is invariant under the action of $h_\lambda$. For the sake of convenience
we denote by $h_{\lambda,\mu}$ the restriction of $h_\lambda$ to the subspace $V(\mu)$. This is an operator
acting from $V(\mu)$ to $V(\mu)$. Let's find its kernel:

$$\operatorname{Ker} h_{\lambda,\mu} = \{\mathbf{v} \in V(\mu) : \ h_\lambda(\mathbf{v}) = 0\} = \operatorname{Ker} h_\lambda \cap V(\mu).$$

The kernel of the operator $h_\lambda$ by definition coincides with the eigenspace $V_\lambda$.
Therefore, $\operatorname{Ker} h_{\lambda,\mu} = V_\lambda \cap V(\mu)$.

Let $\mathbf{v}$ be an arbitrary vector of the kernel $\operatorname{Ker} h_{\lambda,\mu}$. Due to the above result $\mathbf{v}$
belongs to $V_\lambda$. Therefore, we have the equality

$$f(\mathbf{v}) = \lambda \cdot \mathbf{v}. \tag{6.4}$$

Simultaneously, we have the other condition $\mathbf{v} \in V(\mu)$, which means that there
exists some integer number $k > 0$ such that

$$(h_\mu)^k\, \mathbf{v} = (f - \mu \cdot 1)^k\, \mathbf{v} = 0. \tag{6.5}$$

From (6.4) we get $h_\mu(\mathbf{v}) = f(\mathbf{v}) - \mu \cdot \mathbf{v} = (\lambda - \mu) \cdot \mathbf{v}$. Combining this equality with
(6.5), we obtain the following equality for $\mathbf{v}$:

$$(h_\mu)^k\, \mathbf{v} = (\lambda - \mu)^k \cdot \mathbf{v} = 0.$$

Therefore, if $\lambda \neq \mu$, we immediately get $\mathbf{v} = 0$, which means that $\operatorname{Ker} h_{\lambda,\mu} = \{0\}$.
Hence, in the case $\lambda \neq \mu$ the operator $h_{\lambda,\mu}\colon V(\mu) \to V(\mu)$ is injective. The
surjectivity of this operator and, hence, its bijectivity follows from its injectivity
due to the theorem 1.3.

Now let's prove the second proposition of the theorem. In this case $\mu = \lambda$,
therefore, we consider the operator $h_{\lambda,\lambda}$ being the restriction of $h_\lambda$ to the subspace
$V(\lambda)$. Note that $h_{\lambda,\lambda}\, \mathbf{v} = h_\lambda\, \mathbf{v}$ for all $\mathbf{v} \in V(\lambda)$. Therefore, from the definition of a
root subspace we conclude that for any vector $\mathbf{v} \in V(\lambda)$ there is a positive integer
number $k$ such that $(h_{\lambda,\lambda})^k\, \mathbf{v} = (f - \lambda \cdot 1)^k\, \mathbf{v} = 0$. This equality means that $h_{\lambda,\lambda}$
is a nilpotent operator in $V(\lambda)$. The theorem is proved. □

Theorem 6.4. Let $\lambda_1, \ldots, \lambda_s$ be a set of mutually distinct eigenvalues of a
linear operator $f\colon V \to V$. Then the sum of corresponding root subspaces is a
direct sum: $V(\lambda_1) + \ldots + V(\lambda_s) = V(\lambda_1) \oplus \ldots \oplus V(\lambda_s)$.

Proof. The proof of this theorem is similar to that of theorem 4.6. Denote by
$W$ the sum of subspaces specified in the theorem:

$$W = V(\lambda_1) + \ldots + V(\lambda_s). \tag{6.6}$$

In order to prove that the sum (6.6) is a direct sum we should prove the uniqueness
of the following expansion for an arbitrary vector $\mathbf{w} \in W$:

$$\mathbf{w} = \mathbf{v}_1 + \ldots + \mathbf{v}_s, \text{ where } \mathbf{v}_i \in V(\lambda_i). \tag{6.7}$$

Consider another expansion of the same sort for the same vector $\mathbf{w}$:

$$\mathbf{w} = \tilde{\mathbf{v}}_1 + \ldots + \tilde{\mathbf{v}}_s, \text{ where } \tilde{\mathbf{v}}_i \in V(\lambda_i). \tag{6.8}$$

Then let's subtract the second expansion from the first one and for the sake of
brevity denote $\mathbf{w}_i = (\mathbf{v}_i - \tilde{\mathbf{v}}_i) \in V(\lambda_i)$. As a result we get

$$\mathbf{w}_1 + \ldots + \mathbf{w}_s = 0. \tag{6.9}$$

Denote $h_r = f - \lambda_r \cdot 1$. According to the definition of the root subspace $V(\lambda_r)$,
for any vector $\mathbf{w}_r$ in the expansion (6.9) there is some positive integer number $k_r$
such that $(h_r)^{k_r}\, \mathbf{w}_r = 0$. We use this fact and define the operators

$$f_i = \prod_{r \neq i} (h_r)^{k_r}. \tag{6.10}$$

Due to the permutability of the operators $h_1, \ldots, h_s$ belonging to the polynomial
envelope of the operator $f$ and due to the equality $(h_r)^{k_r}\, \mathbf{w}_r = 0$ we get

$$f_i(\mathbf{w}_j) = 0 \text{ for all } j \neq i.$$

Let's apply the operator (6.10) to both sides of the equality (6.9). Then all terms
in the sum in the left hand side of this equality vanish, except for the $i$-th term only.
This yields $f_i(\mathbf{w}_i) = 0$. Let's write this equality in expanded form:

$$\Bigl(\, \prod_{r \neq i} (h_r)^{k_r} \Bigr)\, \mathbf{w}_i = 0. \tag{6.11}$$

The vector $\mathbf{w}_i$ belongs to the root space $V(\lambda_i)$, which is invariant under the action
of all operators $h_r$ in (6.11). Therefore we can replace the operators $h_r$ in (6.11)
by their restrictions $h_{r,i}$ to the subspace $V(\lambda_i)$:

$$\Bigl(\, \prod_{r \neq i} (h_{r,i})^{k_r} \Bigr)\, \mathbf{w}_i = 0. \tag{6.12}$$








According to the theorem 6.3, the restricted operators $h_{r,i}$ are bijective if $r \neq i$.
The product (the composition) of bijective operators is bijective. We also know
that applying a bijective operator to a nonzero vector we would get a nonzero result.
Therefore, (6.12) implies $\mathbf{w}_i = 0$. Then $\mathbf{v}_i = \tilde{\mathbf{v}}_i$ and the expansions (6.7) and
(6.8) do coincide. The uniqueness of the above expansion (6.7) and the theorem as a
whole are proved. □

Theorem 6.5. Let $f$ be a linear operator in a finite-dimensional space $V$ over
the field $\mathbb{K}$ and suppose that its characteristic polynomial factorizes into a product
of linear terms in $\mathbb{K}$. Then the sum of all root subspaces of the operator $f$ is equal
to $V$, i.e. $V(\lambda_1) \oplus \ldots \oplus V(\lambda_s) = V$, where $\lambda_1, \ldots, \lambda_s$ is the set of all mutually
distinct eigenvalues of the operator $f$.

PROOF. Since $\lambda_1, \ldots, \lambda_s$ is the set of all mutually distinct eigenvalues of the
operator $f$, for its characteristic polynomial we get

$$\det(f - \lambda \cdot 1) = \prod_{i=1}^{s} (\lambda_i - \lambda)^{n_i}.$$

According to the hypothesis of the theorem, it is factorized into a product of linear
polynomials of the form $\lambda_i - \lambda$, where $\lambda_i$ is an eigenvalue of $f$ and $n_i$ is the
multiplicity of this eigenvalue. Let's denote by $W$ the total sum of all root
subspaces of the operator $f$; we know that this is a direct sum (see theorem 6.4):

$$W = V(\lambda_1) \oplus \ldots \oplus V(\lambda_s).$$

The root subspaces are nonzero, hence, $W \neq \{0\}$.

Further proof is by contradiction. Assume that the proposition of the theorem
is false and $W \neq V$. The subspace $W$ is invariant under the action of $f$ as a
sum of invariant subspaces $V(\lambda_i)$ (see theorem 3.2). Due to the theorem 4.5 it is
invariant under the action of the operator $h_\lambda = f - \lambda \cdot 1$ as well. Let's apply the
theorem 3.5 to the operator $h_\lambda$. This yields

$$\det(f - \lambda \cdot 1) = \det(f|_W - \lambda \cdot 1)\, \det(f_{V/W} - \lambda \cdot 1). \tag{6.13}$$

Here we took into account that $1|_W = 1$ and $1_{V/W} = 1$; we also used the
theorem 3.4. The characteristic polynomial of the operator $f$ is the product of the
characteristic polynomial of the restricted operator $f|_W$ and that of the
factoroperator $f_{V/W}$. The left hand side of (6.13) factorizes into a product of linear
polynomials in $\mathbb{K}$, therefore, each of the polynomials in the right hand side of (6.13)
should do the same. Let $\lambda_q$ be one of the eigenvalues of the factoroperator $f_{V/W}$ and
let $Q \in V/W$ be the corresponding eigenvector. Due to (6.13) the number $\lambda_q$ is in the
list $\lambda_1, \ldots, \lambda_s$ of eigenvalues of the operator $f$. Due to our assumption $W \neq V$ we
conclude that the factorspace $V/W$ is nontrivial: $V/W \neq \{0\}$, and the coset $Q$ is
not zero. Suppose that $\mathbf{v} \in Q$ is a representative of this coset $Q$. Since $Q \neq 0$, we
have $\mathbf{v} \notin W$. The coset $Q$ is an eigenvector of the factoroperator $f_{V/W}$, therefore,
it should satisfy the following equality:



$$(f_{V/W} - \lambda_q \cdot 1)\, Q = \operatorname{Cl}_W((f - \lambda_q \cdot 1)\, \mathbf{v}) = 0. \tag{6.14}$$

Let's denote $h_r = f - \lambda_r \cdot 1$ for all $r = 1, \ldots, s$ (we have already used this notation
in proving the previous theorem). The relationship (6.14) means that

$$(f - \lambda_q \cdot 1)\, \mathbf{v} = h_q(\mathbf{v}) = \mathbf{w} \in W. \tag{6.15}$$

From the expansion $W = V(\lambda_1) \oplus \ldots \oplus V(\lambda_s)$ for the vector $\mathbf{w}$, which arises in
formula (6.15), we get the expansion

$$h_q(\mathbf{v}) = \mathbf{w} = \mathbf{v}_1 + \ldots + \mathbf{v}_s, \text{ where } \mathbf{v}_i \in V(\lambda_i). \tag{6.16}$$

Let's consider the restriction of the operator $h_q$ to the root subspace $V(\lambda_i)$; this
restriction is denoted $h_{q,i}$ (see the proof of theorem 6.4). Due to the theorem 6.3
we know that the operators $h_{q,i}\colon V(\lambda_i) \to V(\lambda_i)$ are bijective for all $i \neq q$.
Therefore, for all $\mathbf{v}_1, \ldots, \mathbf{v}_s$ in (6.16) other than $\mathbf{v}_q$ we can find $\tilde{\mathbf{v}}_i \in V(\lambda_i)$ such
that $\mathbf{v}_i = h_{q,i}(\tilde{\mathbf{v}}_i)$. Let's substitute these expressions into (6.16). Then we get

$$\mathbf{w} = h_q(\mathbf{v}) = \mathbf{v}_q + \sum_{i \neq q} h_q(\tilde{\mathbf{v}}_i). \tag{6.17}$$

Relying upon this formula (6.17), we define the new vector $\tilde{\mathbf{v}}_q$:

$$\tilde{\mathbf{v}}_q = \mathbf{v} - \sum_{i \neq q} \tilde{\mathbf{v}}_i. \tag{6.18}$$

For this vector from (6.17) we derive $h_q(\tilde{\mathbf{v}}_q) = \mathbf{v}_q \in V(\lambda_q)$. Due to the definition
of the root subspace $V(\lambda_q)$ there exists a positive integer number $k$ such that
$(h_q)^k\, \mathbf{v}_q = 0$. Hence, $(h_q)^{k+1}\, \tilde{\mathbf{v}}_q = 0$ and, therefore, $\tilde{\mathbf{v}}_q \in V(\lambda_q)$. Returning back
to the formula (6.18), we derive

$$\mathbf{v} = \sum_{i=1}^{s} \tilde{\mathbf{v}}_i, \text{ where } \tilde{\mathbf{v}}_i \in V(\lambda_i). \tag{6.19}$$

From the formula (6.19) and from the expansion $W = V(\lambda_1) \oplus \ldots \oplus V(\lambda_s)$ it follows
that $\mathbf{v} \in W$, but this contradicts our initial choice $\mathbf{v} \notin W$, which was possible
due to the assumption $W \neq V$. Hence, $W = V$. The theorem is proved. □

§ 7. Jordan basis of a linear operator. 
Hamilton-Cayley theorem. 

Let $f\colon V \to V$ be a linear operator in a finite-dimensional linear vector space $V$.
Suppose that $V$ is expanded into the sum of root subspaces of the operator $f$:

$$V = V(\lambda_1) \oplus \ldots \oplus V(\lambda_s). \tag{7.1}$$

Let's denote $h_i = f - \lambda_i \cdot 1$. Then denote by $h_{i,j}$ the restriction of $h_i$ to $V(\lambda_j)$.
According to the theorem 6.3, the restriction $h_{i,i}$ is a nilpotent operator in the $i$-th
root subspace $V(\lambda_i)$. Therefore, in $V(\lambda_i)$ we can choose a canonic Jordan basis
for this operator (see theorem 5.6). The matrix of the operator $h_{i,i}$ in a canonic
Jordan basis is a matrix of the form (5.21) composed of diagonal blocks, where
each diagonal block is a matrix of the form (5.14).

Definition 7.1. A Jordan normal basis of an operator $f\colon V \to V$ is a basis
composed of canonic Jordan bases of the nilpotent operators $h_{i,i}$ in the root subspaces
$V(\lambda_i)$ of the operator $f$.

Note that an operator $f$ in a finite-dimensional space $V$ possesses a Jordan
normal basis if and only if $V$ is expanded into the sum of root subspaces of
the operator $f$, i.e. if we have (7.1). The theorem 6.5 yields a sufficient condition
for the existence of a Jordan normal basis of a linear operator.

Suppose that an operator $f$ in a finite-dimensional linear vector space $V$
possesses a Jordan normal basis. The subspaces $V(\lambda_i)$ in (7.1) are invariant with
respect to $f$. Let's denote by $f_i$ the restriction of $f$ to $V(\lambda_i)$. The matrix of the
operator $f$ in a Jordan normal basis is a blockwise-diagonal matrix:

$$F = \begin{Vmatrix} F_1 & & \\ & \ddots & \\ & & F_s \end{Vmatrix}. \tag{7.2}$$

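The diagonal blocks $F_i$ in (7.2) are determined by the operators $f_i$. Note that the
operators $f_i$ and $h_{i,i}$ are related to each other by the equality $f_i = h_{i,i} + \lambda_i \cdot 1$.
Therefore, $F_i$ is also a blockwise-diagonal matrix:

$$F_i = \begin{Vmatrix} J_{k_1}(\lambda_i) & & \\ & \ddots & \\ & & J_{k_r}(\lambda_i) \end{Vmatrix}. \tag{7.3}$$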

The number of diagonal blocks in (7.3) is determined by the number of chains in a
canonic Jordan basis of the nilpotent operator $h_{i,i}$, while these diagonal blocks
themselves are matrices of the following form:

$$J_k(\lambda) = \begin{Vmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{Vmatrix}. \tag{7.4}$$

A matrix of the form (7.4) is called a Jordan block or a Jordan cage with $\lambda$ on
the diagonal. This is a square $k \times k$ matrix; if $k = 1$ this matrix degenerates and
becomes a matrix with the single element $J_1(\lambda) = \Vert \lambda \Vert$.
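
The layout described by the formulas (7.2), (7.3), and (7.4) can be sketched in a
few lines of Python. This is an illustration only: it assembles a matrix from
prescribed eigenvalues and block sizes and does not find them for a given
operator. The names `jordan_block`, `block_diagonal`, `jordan_normal_form` and
the format of the argument `spec` are assumptions made for this example.

```python
def jordan_block(k, lam):
    """k x k Jordan block (7.4): lam on the diagonal, ones just above it."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(k)]
            for i in range(k)]


def block_diagonal(blocks):
    """Place square blocks along the diagonal and fill the rest with zeros."""
    n = sum(len(b) for b in blocks)
    result = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                result[offset + i][offset + j] = x
        offset += len(b)
    return result


def jordan_normal_form(spec):
    """spec is a list of pairs (eigenvalue, [block sizes]), one pair per root
    subspace; the result has the structure (7.2) with the blocks (7.3)."""
    return block_diagonal([jordan_block(k, lam)
                           for lam, sizes in spec for k in sizes])


# Hypothetical data: eigenvalue 2 with blocks of the sizes 2 and 1, eigenvalue 5
# with a single block of the size 1.
F = jordan_normal_form([(2, [2, 1]), (5, [1])])
```

In the resulting $4 \times 4$ matrix the eigenvalue 2 occupies three diagonal
positions and the eigenvalue 5 occupies one, according to the dimensions of the
corresponding root subspaces.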

The matrix of an operator $f$ in a Jordan normal basis presented by the
relationships (7.2), (7.3), and (7.4) is called a Jordan normal form of the matrix
of this operator. The problem of constructing a Jordan normal basis for a linear
operator $f$ and thus finding the Jordan normal form $F$ of its matrix is known as
the problem of bringing the matrix of a linear operator to a Jordan normal form.
If the matrix of a linear operator can be brought to a Jordan normal form, this
fact has several important consequences. Note that a matrix of the form (7.4) is
upper-triangular. Hence, (7.3) and (7.2) are upper-triangular matrices as well. The
entries on the diagonal of (7.2) are the eigenvalues of the operator $f$, the $i$-th
eigenvalue $\lambda_i$ being presented $n_i$ times, where $n_i = \dim V(\lambda_i)$. From the course of
algebra we know that the determinant of an upper-triangular matrix is equal to
the product of all its diagonal elements. Therefore, the characteristic polynomial
of an operator possessing a Jordan normal basis is given by the formula

$$\det(f - \lambda \cdot 1) = \prod_{i=1}^{s} (\lambda_i - \lambda)^{n_i}. \tag{7.5}$$

Theorem 7.1. The matrix of a linear operator $f$ in a finite-dimensional linear
vector space $V$ over a numeric field $\mathbb{K}$ can be brought to a Jordan normal form if and
only if its characteristic polynomial factorizes into the product of linear polynomials
in the field $\mathbb{K}$.

Proof. The necessity of the condition formulated in the theorem 7.1 is immediate
from (7.5); the sufficiency is provided by the theorems 5.6 and 6.5. □

In the case of the field of complex numbers $\mathbb{C}$ any polynomial factorizes into a
product of linear terms. Therefore, the matrix of any linear operator in a complex
linear vector space can be brought to a Jordan normal form.

Theorem 7.2. The multiplicity of an eigenvalue $\lambda$ of a linear operator $f$ in a
finite-dimensional linear vector space $V$ is equal to the dimension of the corresponding
root subspace $V(\lambda)$.

For an operator $f$ whose characteristic polynomial factorizes into
the product of linear terms, the proposition of theorem 7.2 immediately follows
from the formula (7.5). However, this fact is valid also in the case of partial
factorization. Such a case can be reduced to the case of complete factorization
by means of the field extension technique. We do not consider the field extension
technique in this book. But it is worth noting that the complete proof of the
following Hamilton-Cayley theorem is also based on that technique.

Theorem 7.3. Let $P(\lambda)$ be the characteristic polynomial of a linear operator
$f$ in a finite-dimensional space $V$. Then $P(f) = 0$.
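
Before proving the theorem, let's check its statement numerically on one example. The
following Python sketch is only an illustration under stated assumptions: the operator is
assumed to be given by a matrix in Jordan normal form, so that its characteristic
polynomial has the factorized form (7.5), and the names `mat_mul`, `shifted`,
`evaluate_char_poly`, as well as the sample matrix `F`, are hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(f, lam):
    """Return the matrix lam * 1 - f."""
    n = len(f)
    return [[(lam if i == j else 0) - f[i][j] for j in range(n)]
            for i in range(n)]

def evaluate_char_poly(f, spectrum):
    """Evaluate the product of (lam_i * 1 - f)^(n_i) over spectrum = [(lam_i, n_i), ...]."""
    n = len(f)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for lam, mult in spectrum:
        factor = shifted(f, Fraction(lam))
        for _ in range(mult):
            result = mat_mul(result, factor)
    return result

# Sample matrix with the blocks J_2(2) and J_1(3); here P(l) = (2 - l)^2 (3 - l).
F = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
P_of_F = evaluate_char_poly(F, [(2, 2), (3, 1)])
```

Here `P_of_F` is the zero matrix. If the exponent of the first factor is lowered to 1,
the result is no longer zero, which shows that the multiplicities in (7.5) matter.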

PROOF. We shall prove the Hamilton-Cayley theorem for the case where the
characteristic polynomial $P(\lambda)$ factorizes into the product of linear terms:

$$P(\lambda) = \prod_{i=1}^{s} (\lambda_i - \lambda)^{n_i}. \tag{7.6}$$

Denote $h_i = f - \lambda_i \cdot 1$ and denote by $h_{ij}$ the restriction of $h_i$ to the root subspace
$V(\lambda_j)$. Then from the formula (7.6) we derive

$$P(f) = \prod_{i=1}^{s} (\lambda_i \cdot 1 - f)^{n_i} = (-1)^n \prod_{i=1}^{s} (h_i)^{n_i},$$

where $n = n_1 + \ldots + n_s = \dim V$.
Let's apply $P(f)$ to an arbitrary vector $\mathbf{v} \in V$. Due to the theorem 6.5 we can
expand $\mathbf{v}$ into a sum $\mathbf{v} = \mathbf{v}_1 + \ldots + \mathbf{v}_s$, where $\mathbf{v}_i \in V(\lambda_i)$. Therefore, we have

$$P(f)\,\mathbf{v} = P(f)\,\mathbf{v}_1 + \ldots + P(f)\,\mathbf{v}_s. \tag{7.7}$$

The root subspace $V(\lambda_j)$ is invariant under the action of the operators $h_i$. Then

$$P(f)\,\mathbf{v}_j = (-1)^n \prod_{i=1}^{s} (h_i)^{n_i}\,\mathbf{v}_j = (-1)^n \prod_{i=1}^{s} (h_{ij})^{n_i}\,\mathbf{v}_j.$$

Using permutability of the operators $h_i$ and their restrictions $h_{ij}$, we can bring
the above expression for $P(f)\,\mathbf{v}_j$ to the following form:

$$P(f)\,\mathbf{v}_j = (-1)^n \Bigl(\prod_{i \ne j} (h_{ij})^{n_i}\Bigr) (h_{jj})^{n_j}\,\mathbf{v}_j. \tag{7.8}$$

The operator $h_{jj}$ is a nilpotent operator in the subspace $V(\lambda_j)$ and $n_j = \dim V(\lambda_j)$.
Therefore, we can apply the theorem 5.7. As a result we obtain
$(h_{jj})^{n_j}\,\mathbf{v}_j = 0$. Now from (7.7) and (7.8) for an arbitrary vector $\mathbf{v} \in V$ we derive
$P(f)\,\mathbf{v} = 0$. This proves the theorem for the special case where the characteristic
polynomial of an operator $f$ factorizes into a product of linear terms. The general
case is reduced to this special case by means of the field extension technique,
which we do not consider in this book. □
§ 1. Linear functionals.
Vectors and covectors. Dual space.

Definition 1.1. Let $V$ be a linear vector space over a numeric field $\mathbb{K}$. A
numeric function $y = f(\mathbf{v})$ with vectorial argument $\mathbf{v} \in V$ and with values $y \in \mathbb{K}$
is called a linear functional if

(1) $f(\mathbf{v}_1 + \mathbf{v}_2) = f(\mathbf{v}_1) + f(\mathbf{v}_2)$ for any two $\mathbf{v}_1, \mathbf{v}_2 \in V$;

(2) $f(\alpha \cdot \mathbf{v}) = \alpha\, f(\mathbf{v})$ for any $\mathbf{v} \in V$ and for any $\alpha \in \mathbb{K}$.

The definition of a linear functional is quite similar to the definition of a linear
mapping (see definition 8.1 in Chapter I). Comparing these two definitions, we see
that any linear functional $f$ is a linear mapping $f \colon V \to \mathbb{K}$ and, conversely, any
such linear mapping is a linear functional. Thereby the numeric field $\mathbb{K}$ is treated
as a linear space of the dimension 1 over itself.

Linear functionals, as linear mappings from $V$ to $\mathbb{K}$, constitute the space
$\operatorname{Hom}(V, \mathbb{K})$, which is called the dual space or the conjugate space for the space
$V$. The dual space $\operatorname{Hom}(V, \mathbb{K})$ is denoted by $V^*$. The space of homomorphisms
$\operatorname{Hom}(V, W)$ is usually determined by two spaces $V$ and $W$. However, the dual
space $V^* = \operatorname{Hom}(V, \mathbb{K})$ is an exception: it is determined only by $V$, since $\mathbb{K}$ is
known whenever $V$ is given (see definition 2.1 in Chapter I).

Thus, $V^* = \operatorname{Hom}(V, \mathbb{K})$ is a linear vector space over the same numeric field
$\mathbb{K}$ as $V$. If $V$ is finite-dimensional, then the dimension of the conjugate space is
determined by the theorem 10.4 in Chapter I: $\dim V^* = \dim V$. The structure of a
linear vector space in $V^* = \operatorname{Hom}(V, \mathbb{K})$ is determined by two algebraic operations:
pointwise addition and pointwise multiplication by numbers (see
definitions 10.1 and 10.2 in Chapter I). However, it is worth formulating
these two definitions especially for the present case of linear functionals.

Definition 1.2. Let $f$ and $g$ be two linear functionals of $V^*$. The sum of the
functionals $f$ and $g$ is a functional $h$ whose values are determined by the formula
$h(\mathbf{v}) = f(\mathbf{v}) + g(\mathbf{v})$ for all $\mathbf{v} \in V$.

Definition 1.3. Let $f$ be a linear functional of $V^*$. The product of the
functional $f$ by a number $\alpha \in \mathbb{K}$ is a functional $h$ whose values are determined by the
formula $h(\mathbf{v}) = \alpha \cdot f(\mathbf{v})$ for all $\mathbf{v} \in V$.

Let $V$ be a finite-dimensional vector space over a field $\mathbb{K}$ and let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be
a basis in $V$. Then each vector $\mathbf{v} \in V$ can be expanded in this basis:

$$\mathbf{v} = v^1 \cdot \mathbf{e}_1 + \ldots + v^n \cdot \mathbf{e}_n. \tag{1.1}$$



CHAPTER III. DUAL SPACE. 



Let's consider the $i$-th coordinate of the vector $\mathbf{v}$. Due to the uniqueness of the
expansion (1.1), when the basis is fixed, $v^i$ is a number uniquely determined by
the vector $\mathbf{v}$. Hence, we can consider a map $h^i \colon V \to \mathbb{K}$, defining it by the formula
$h^i(\mathbf{v}) = v^i$. When adding vectors, their coordinates are added; when multiplying
a vector by a number, its coordinates are multiplied by that number (see the
relationships (5.4) in Chapter I). Therefore, $h^i \colon V \to \mathbb{K}$ is a linear mapping. This
means that each basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ of a linear vector space $V$ determines $n$ linear
functionals in $V^*$. The functionals $h^1, \ldots, h^n$ are called the coordinate functionals
of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. They satisfy the relationships

$$h^i(\mathbf{e}_j) = \delta^i_j, \tag{1.2}$$

where $\delta^i_j$ is the Kronecker symbol. These relationships (1.2) are called the
relationships of biorthogonality.

The proof of the relationships of biorthogonality is very simple. If we expand
the vector $\mathbf{e}_j$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, then its $j$-th component is equal to unity,
while all other components are equal to zero. Note that $h^i(\mathbf{e}_j)$ is a number equal
to the $i$-th component of the vector $\mathbf{e}_j$. Therefore, $h^i(\mathbf{e}_j) = 1$ if $i = j$ and $h^i(\mathbf{e}_j) = 0$
in all other cases.

Theorem 1.1. Coordinate functionals $h^1, \ldots, h^n$ are linearly independent;
they form a basis in the dual space $V^*$.

PROOF. Let's consider a linear combination of the coordinate functionals
associated with a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and assume that it is equal to zero:

$$\alpha_1 \cdot h^1 + \ldots + \alpha_n \cdot h^n = 0. \tag{1.3}$$

The right hand side of (1.3) is the zero functional. Its value when applied to the base
vector $\mathbf{e}_j$ is equal to zero. Hence, we have

$$\alpha_1\, h^1(\mathbf{e}_j) + \ldots + \alpha_n\, h^n(\mathbf{e}_j) = 0. \tag{1.4}$$

Now we use the relationships of biorthogonality (1.2). Due to these relationships
among the $n$ terms $h^1(\mathbf{e}_j), \ldots, h^n(\mathbf{e}_j)$ in the left hand side of the equality (1.4) only one
term is nonzero: $h^j(\mathbf{e}_j) = 1$. Therefore, (1.4) reduces to $\alpha_j = 0$. But $j$ is an
index that runs from 1 to $n$. Hence, all coefficients of the linear combination
(1.3) are zero, i.e. it is trivial and the coordinate functionals $h^1, \ldots, h^n$ are linearly
independent.

In order to complete the proof of the theorem now we could use the equality
$\dim V^* = \dim V = n$ and refer to the item (4) of the theorem 4.5 in Chapter I.
However, we choose a more explicit way and directly prove that the coordinate
functionals $h^1, \ldots, h^n$ span the dual space $V^*$. Let $f \in V^*$ be an arbitrary linear
functional and let $\mathbf{v}$ be an arbitrary vector of $V$. Then from (1.1) we derive

$$f(\mathbf{v}) = v^1 f(\mathbf{e}_1) + \ldots + v^n f(\mathbf{e}_n) = f(\mathbf{e}_1)\, h^1(\mathbf{v}) + \ldots + f(\mathbf{e}_n)\, h^n(\mathbf{v}).$$

Here $f(\mathbf{e}_1), \ldots, f(\mathbf{e}_n)$ are numeric coefficients from $\mathbb{K}$ and $\mathbf{v}$ is an arbitrary
vector of $V$. Therefore, the above equality can be rewritten as an equality of linear
functionals in the conjugate space $V^*$:

$$f = f(\mathbf{e}_1) \cdot h^1 + \ldots + f(\mathbf{e}_n) \cdot h^n. \tag{1.5}$$

The formula (1.5) shows that an arbitrary functional $f \in V^*$ can be represented as
a linear combination of the coordinate functionals $h^1, \ldots, h^n$. Hence, being linearly
independent, they form a basis in $V^*$. The theorem is proved. □

Definition 1.4. The basis $h^1, \ldots, h^n$ in $V^*$ formed by the coordinate functionals
associated with a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ is called the dual basis or the conjugate
basis for $\mathbf{e}_1, \ldots, \mathbf{e}_n$.
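
The dual basis can be computed explicitly. The following Python sketch is an illustration
under stated assumptions: the space is taken to be $\mathbb{Q}^n$, vectors are given by their
coordinates in the standard basis, and each functional is stored as the row of its values
on the standard basis vectors. Under these assumptions the coordinate functionals are the
rows of the matrix inverse to the matrix whose columns are the basis vectors. The names
`inverse`, `dual_basis`, and `pairing` are hypothetical.

```python
from fractions import Fraction

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals.

    Raises StopIteration if the matrix is singular (no pivot is found).
    """
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def dual_basis(basis):
    """Return the coordinate functionals h^1, ..., h^n of the given basis."""
    n = len(basis)
    # e_matrix[i][j] is the i-th coordinate of the j-th basis vector.
    e_matrix = [[basis[j][i] for j in range(n)] for i in range(n)]
    return inverse(e_matrix)

def pairing(h, v):
    """Value of the functional h on the vector v."""
    return sum(hi * vi for hi, vi in zip(h, v))

basis = [[1, 0], [1, 1]]   # e_1 = (1, 0), e_2 = (1, 1)
dual = dual_basis(basis)   # rows of dual are h^1 and h^2
```

The relationships of biorthogonality (1.2) are then checked by evaluating
`pairing(dual[i], basis[j])` for all pairs of indices.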

Definition 1.5. Let $f$ be a linear functional in a finite-dimensional space $V$
and let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in this space. The numbers $f_1, \ldots, f_n$ determined
by the linear functional $f$ according to the formula

$$f_i = f(\mathbf{e}_i) \tag{1.6}$$

are called the coordinates or the components of $f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$.

As we see in formula (1.5), the numbers (1.6) are the coefficients of the
expansion of $f$ in the conjugate basis $h^1, \ldots, h^n$. However, in the definition 1.5
they are mentioned as the components of $f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. This is a purely
terminological trick: it means that we consider $\mathbf{e}_1, \ldots, \mathbf{e}_n$ as a primary basis,
while the conjugate basis is treated as an auxiliary and complementary thing.

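The algebraic operations of addition and multiplication by numbers in the
spaces $V$ and $V^*$ are related to each other by the following equalities:

$$\begin{aligned} f(\mathbf{v}_1 + \mathbf{v}_2) &= f(\mathbf{v}_1) + f(\mathbf{v}_2), & f(\alpha \cdot \mathbf{v}) &= \alpha\, f(\mathbf{v});\\ (f_1 + f_2)(\mathbf{v}) &= f_1(\mathbf{v}) + f_2(\mathbf{v}), & (\alpha \cdot f)(\mathbf{v}) &= \alpha\, f(\mathbf{v}). \end{aligned} \tag{1.7}$$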

Vectors and linear functionals enter these equalities in a quite similar way. The
fact that in the writing $f(\mathbf{v})$ the functional plays the role of a function, while the
vector is written as an argument, is not so important. Therefore, sometimes the
quantity $f(\mathbf{v})$ is denoted differently:

$$f(\mathbf{v}) = \langle f \,|\, \mathbf{v} \rangle. \tag{1.8}$$

The writing (1.8) is associated with a special terminology. Functionals from the
dual space $V^*$ are called covectors, while the expression $\langle f \,|\, \mathbf{v} \rangle$ itself is called the
pairing, or the contraction, or even the scalar product of a vector and a covector.

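The scalar product (1.8) possesses the property of bilinearity: it is linear in its
first argument $f$ and in its second argument $\mathbf{v}$. This follows from the relationships
(1.7), which are now written as

$$\begin{aligned} \langle f_1 + f_2 \,|\, \mathbf{v} \rangle &= \langle f_1 \,|\, \mathbf{v} \rangle + \langle f_2 \,|\, \mathbf{v} \rangle, & \langle \alpha \cdot f \,|\, \mathbf{v} \rangle &= \alpha\, \langle f \,|\, \mathbf{v} \rangle;\\ \langle f \,|\, \mathbf{v}_1 + \mathbf{v}_2 \rangle &= \langle f \,|\, \mathbf{v}_1 \rangle + \langle f \,|\, \mathbf{v}_2 \rangle, & \langle f \,|\, \alpha \cdot \mathbf{v} \rangle &= \alpha\, \langle f \,|\, \mathbf{v} \rangle. \end{aligned} \tag{1.9}$$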

We have already dealt with the concept of bilinearity earlier in this book (see
theorem 1.1 in Chapter II).
The properties (1.9) of the scalar product (1.8) are analogous to the properties
of the scalar product of geometric vectors, which is usually studied in the course of
analytic geometry (see [5]). However, in contrast to that «geometric» scalar
product, the scalar product (1.8) is not symmetric: its arguments belong to
different spaces, so they cannot be swapped. Covectors in the scalar product (1.8)
are always written on the left and vectors are always on the right.

The following definition is dictated by the intention to strengthen the analogy
between (1.8) and the traditional «geometric» scalar product.

Definition 1.7. A vector $\mathbf{v}$ and a covector $f$ are called orthogonal to each
other if their scalar product is zero: $\langle f \,|\, \mathbf{v} \rangle = 0$.

Theorem 1.2. Let $U \subset V$ be a subspace in a finite-dimensional vector space $V$
and let $\mathbf{v} \notin U$. Then there exists a linear functional $f$ in $V^*$ such that $f(\mathbf{v}) \neq 0$
and $f(\mathbf{u}) = 0$ for all $\mathbf{u} \in U$.

PROOF. Let $\dim V = n$ and $\dim U = s$. Let's choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ in
the subspace $U$. Let's add the vector $\mathbf{v}$ to the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$ and denote it
$\mathbf{v} = \mathbf{e}_{s+1}$. The extended system of vectors is linearly independent since $\mathbf{v} \notin U$,
see the item (4) of the theorem 3.1 in Chapter I. Denote by $W = \langle \mathbf{e}_1, \ldots, \mathbf{e}_{s+1} \rangle$
the linear span of this system of vectors. It is clear that $W$ is a subspace of
$V$ comprising the initial subspace $U$; its dimension is greater by one than the
dimension of $U$. The vectors $\mathbf{e}_1, \ldots, \mathbf{e}_{s+1}$ form a basis in $W$. If $W \neq V$, then
we complete the basis $\mathbf{e}_1, \ldots, \mathbf{e}_{s+1}$ up to a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in the space $V$ and
consider the coordinate functionals $h^1, \ldots, h^n$ associated with this basis. Let's
denote $f = h^{s+1}$. Then from the relationships of biorthogonality (1.2) we derive

$$f(\mathbf{v}) = h^{s+1}(\mathbf{e}_{s+1}) = 1 \quad\text{and}\quad f(\mathbf{e}_i) = 0 \ \text{ for } i = 1, \ldots, s.$$

Being zero on the basis vectors of the subspace $U$, the functional $f = h^{s+1}$ vanishes
on all vectors $\mathbf{u} \in U$. Its value on the vector $\mathbf{v}$ is equal to unity. □

Let's consider the case $U = \{0\}$ in the above theorem. Then for any nonzero
vector $\mathbf{v}$ we have $\mathbf{v} \notin U$, and we can formulate the following corollary of the
theorem 1.2.

COROLLARY. For any vector $\mathbf{v} \neq 0$ in a finite-dimensional space $V$ there is a
linear functional $f$ in $V^*$ such that $f(\mathbf{v}) \neq 0$.

Let $V$ be a linear vector space over the field $\mathbb{K}$ and let $W = V^*$ be the
conjugate space of $V$. We know that $W$ is also a linear vector space over the field
$\mathbb{K}$. Therefore, it possesses its own conjugate space $W^*$. With respect to $V$ this
is the double conjugate space $V^{**}$. We can also consider the triple conjugate, the fourth
conjugate, etc. Thus we would have an infinite sequence of conjugate spaces.
However, soon we shall see that in the case of finite-dimensional spaces there is
no need to consider the multiple conjugate spaces.

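Let $\mathbf{v} \in V$. To any $f \in V^*$ we associate the number $f(\mathbf{v}) \in \mathbb{K}$. Thus we define
a mapping $\varphi_{\mathbf{v}} \colon V^* \to \mathbb{K}$, which is linear due to the following relationships:

$$\varphi_{\mathbf{v}}(f_1 + f_2) = (f_1 + f_2)(\mathbf{v}) = f_1(\mathbf{v}) + f_2(\mathbf{v}) = \varphi_{\mathbf{v}}(f_1) + \varphi_{\mathbf{v}}(f_2),$$

$$\varphi_{\mathbf{v}}(\alpha \cdot f) = (\alpha \cdot f)(\mathbf{v}) = \alpha\, f(\mathbf{v}) = \alpha\, \varphi_{\mathbf{v}}(f).$$

Hence, $\varphi_{\mathbf{v}}$ is a linear functional in the space $V^*$ or, in other words, it is an element
of the double conjugate space. The functional $\varphi_{\mathbf{v}}$ is determined by a vector $\mathbf{v} \in V$.
Therefore, when associating $\varphi_{\mathbf{v}}$ with a vector $\mathbf{v}$, we define a mapping

$$h \colon V \to V^{**}, \quad\text{where } h(\mathbf{v}) = \varphi_{\mathbf{v}} \text{ for all } \mathbf{v} \in V. \tag{1.10}$$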

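The mapping (1.10) is a linear mapping. In order to prove this fact we should
verify the following identities for this mapping $h$:

$$h(\mathbf{v}_1 + \mathbf{v}_2) = h(\mathbf{v}_1) + h(\mathbf{v}_2), \qquad h(\alpha \cdot \mathbf{v}) = \alpha \cdot h(\mathbf{v}). \tag{1.11}$$

The result of applying $h$ to a vector of the space $V$ is an element of the double
conjugate space $V^{**}$. Therefore, in order to verify the equalities (1.11) we should
apply both sides of these equalities to an arbitrary covector $f \in V^*$ and check the
coincidence of the results that we obtain:

$$\begin{aligned} h(\mathbf{v}_1 + \mathbf{v}_2)(f) &= \varphi_{\mathbf{v}_1 + \mathbf{v}_2}(f) = f(\mathbf{v}_1 + \mathbf{v}_2) = f(\mathbf{v}_1) + f(\mathbf{v}_2) = \varphi_{\mathbf{v}_1}(f) + \varphi_{\mathbf{v}_2}(f)\\ &= h(\mathbf{v}_1)(f) + h(\mathbf{v}_2)(f) = (h(\mathbf{v}_1) + h(\mathbf{v}_2))(f),\\ h(\alpha \cdot \mathbf{v})(f) &= \varphi_{\alpha \cdot \mathbf{v}}(f) = f(\alpha \cdot \mathbf{v}) = \alpha\, f(\mathbf{v}) = \alpha\, \varphi_{\mathbf{v}}(f) = \alpha\, h(\mathbf{v})(f) = (\alpha \cdot h(\mathbf{v}))(f). \end{aligned}$$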

Theorem 1.3. For a finite-dimensional linear vector space $V$ the mapping (1.10)
is bijective. It is an isomorphism of the spaces $V$ and $V^{**}$. This isomorphism is
called the canonic isomorphism of these spaces.

PROOF. First of all we shall prove the injectivity of the mapping (1.10). For
this purpose we consider its kernel $\operatorname{Ker} h$. Let $\mathbf{v}$ be an arbitrary vector of $\operatorname{Ker} h$.
Then $\varphi_{\mathbf{v}} = h(\mathbf{v}) = 0$. But $\varphi_{\mathbf{v}} \in V^{**}$; this means that $\varphi_{\mathbf{v}}$ is a linear functional in the
space $V^*$. Therefore, the equality $\varphi_{\mathbf{v}} = 0$ means that $\varphi_{\mathbf{v}}(f) = 0$ for any covector
$f \in V^*$. Using this equality, from (1.10) we derive

$$h(\mathbf{v})(f) = \varphi_{\mathbf{v}}(f) = f(\mathbf{v}) = 0 \quad\text{for all } f \in V^*. \tag{1.12}$$

Now let's apply the corollary of the theorem 1.2. If the vector $\mathbf{v}$ were nonzero,
then there would be a functional $f$ such that $f(\mathbf{v}) \neq 0$. This would contradict
the above condition (1.12). Hence, $\mathbf{v} = 0$ by contradiction. This means that
$\operatorname{Ker} h = \{0\}$ and $h$ is an injective mapping.

In order to prove the surjectivity of the mapping (1.10) we use the theorem 9.4
from Chapter I. According to this theorem

$$\dim(\operatorname{Ker} h) + \dim(\operatorname{Im} h) = \dim V.$$

We have already proved that $\dim(\operatorname{Ker} h) = 0$. Hence, $\dim(\operatorname{Im} h) = \dim V$. Since
$\operatorname{Im} h$ is a subspace of $V^{**}$ and $\dim V^{**} = \dim V^* = \dim V$, we have $\operatorname{Im} h = V^{**}$
(see item (3) of theorem 4.5 in Chapter I). This completes the proof of surjectivity
of the mapping $h$ and the proof of the theorem as a whole. □
The canonic isomorphism (1.10) possesses the property that for any vector $\mathbf{v} \in V$
and for any covector $f \in V^*$ the following equality holds:

$$\langle h(\mathbf{v}) \,|\, f \rangle = \langle f \,|\, \mathbf{v} \rangle. \tag{1.13}$$

The equality (1.13) is derived from the definition of $h$. Indeed, $\langle h(\mathbf{v}) \,|\, f \rangle = h(\mathbf{v})(f) = \varphi_{\mathbf{v}}(f) = f(\mathbf{v}) = \langle f \,|\, \mathbf{v} \rangle$.
The relationship (1.13) distinguishes the canonic
isomorphism among all other isomorphisms relating the spaces $V$ and $V^{**}$.
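
The construction of the mapping (1.10) is easy to imitate in a program. The following
Python sketch is only an illustration: covectors on $\mathbb{Q}^n$ are modelled as callables
acting on coordinate lists, and the names `functional` and `canonical` are hypothetical.
The function `canonical` plays the role of $h$: it returns the evaluation functional
$\varphi_{\mathbf{v}}$, so that (1.13) holds by construction.

```python
def functional(coords):
    """Covector with coordinates f_1, ..., f_n as a callable v -> f(v)."""
    return lambda v: sum(fi * vi for fi, vi in zip(coords, v))

def canonical(v):
    """Return phi_v, the element of the double dual that sends f to f(v)."""
    return lambda f: f(v)

f = functional([1, 2, 3])
v = [4, 5, 6]
phi_v = canonical(v)     # phi_v(f) and f(v) are the same number
```

No choice of a basis enters the definition of `canonical`; the coordinates are used here
only to write down a sample covector.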

§ 2. Transformation of the coordinates 
of a covector under a change of basis. 

Let V be a finite-dimensional linear vector space and let V* be the associated 
dual space. If we treat V* separately forgetting its relation to V, then a choice of 
basis and a change of basis in V* are quite the same as in any other linear vector 
space. However, the conjugate space V* is practically never considered separately. 
The theory of this space should be understood as an extension of the theory of 
initial space V. 

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in a linear vector space $V$. Each such basis of $V$
has the associated basis of coordinate functionals in $V^*$. Choosing another basis
$\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ in $V$ we immediately get another conjugate basis $\tilde{h}^1, \ldots, \tilde{h}^n$ in $V^*$.
Let $S$ be the transition matrix for passing from the old basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to the new
basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. Similarly, denote by $P$ the transition matrix for passing from
the old dual basis $h^1, \ldots, h^n$ to the new dual basis $\tilde{h}^1, \ldots, \tilde{h}^n$. The components
of these two transition matrices $S$ and $P$ are used to expand the vectors of «wavy»
bases in corresponding «non-wavy» bases:

$$\tilde{\mathbf{e}}_j = \sum_{i=1}^{n} S^i_j\, \mathbf{e}_i, \qquad \tilde{h}^r = \sum_{s=1}^{n} P^r_s\, h^s. \tag{2.1}$$

Note that the second formula (2.1) differs from the standard given by formula
(5.5) in Chapter I: the vectors of dual bases in (2.1) are specified by upper indices,
contrary to the usual convention of enumerating the basis vectors. The reason is
that the dual space $V^*$ and the dual bases are treated as complementary objects
with respect to the initial space $V$ and its bases. We have already seen such
deviations from the standard notations in constructing the basis vectors $E^i_j$ in
$\operatorname{Hom}(V, W)$ (see the proof of the theorem 10.4 in Chapter I).

Although it breaks the standard rules of indexing the basis vectors, the
formula (2.1) does not break the rules of tensorial notation: the free index $r$ is
in the same upper position in both sides of the equality, and the summation index $s$
enters twice: once as an upper index and once as a lower index.

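Theorem 2.1. The transition matrix $P$ for passing from the old conjugate
basis $h^1, \ldots, h^n$ to the new conjugate basis $\tilde{h}^1, \ldots, \tilde{h}^n$ is inverse to the transition
matrix $S$ that is used for passing from the old basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to the new
basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$.

Proof. In order to prove this theorem we use the biorthogonality relationships
(1.2) written for the new bases: $\tilde{h}^r(\tilde{\mathbf{e}}_j) = \delta^r_j$. Substituting (2.1) into these relationships, we get

$$\delta^r_j = \tilde{h}^r(\tilde{\mathbf{e}}_j) = \sum_{s=1}^{n} \sum_{i=1}^{n} P^r_s\, S^i_j\, h^s(\mathbf{e}_i) = \sum_{s=1}^{n} \sum_{i=1}^{n} P^r_s\, S^i_j\, \delta^s_i = \sum_{i=1}^{n} P^r_i\, S^i_j.$$

The above relationship can be written in matrix form $P\,S = 1$. This means that
$P = S^{-1}$. The theorem is proved. □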

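Remember that the inverse transition matrix $T$ is also the inverse matrix for
$S$. Therefore, in order to write the complete set of formulas relating two pairs of
bases in $V$ and $V^*$ it is sufficient to know two matrices $S$ and $T = S^{-1}$:

$$\begin{aligned} \tilde{\mathbf{e}}_j &= \sum_{i=1}^{n} S^i_j\, \mathbf{e}_i, & \tilde{h}^r &= \sum_{s=1}^{n} T^r_s\, h^s,\\ \mathbf{e}_i &= \sum_{j=1}^{n} T^j_i\, \tilde{\mathbf{e}}_j, & h^s &= \sum_{r=1}^{n} S^s_r\, \tilde{h}^r. \end{aligned} \tag{2.2}$$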



Let $f$ be a covector from the conjugate space $V^*$. Let's consider its expansions
in two conjugate bases $h^1, \ldots, h^n$ and $\tilde{h}^1, \ldots, \tilde{h}^n$:

$$f = \sum_{s=1}^{n} f_s\, h^s, \qquad f = \sum_{r=1}^{n} \tilde{f}_r\, \tilde{h}^r. \tag{2.3}$$

The expansions (2.3) also differ from the standard introduced by formula (5.1) in
Chapter I. Another standard is applied to the coordinates of covectors: they are
specified by lower indices and are written in row vectors.

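Theorem 2.2. The coordinates of a covector $f$ in two dual bases $h^1, \ldots, h^n$
and $\tilde{h}^1, \ldots, \tilde{h}^n$ associated with the bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ in $V$ are
related to each other by the formulas

$$\tilde{f}_r = \sum_{s=1}^{n} S^s_r\, f_s, \qquad f_s = \sum_{r=1}^{n} T^r_s\, \tilde{f}_r, \tag{2.4}$$

where $S$ is the direct transition matrix for passing from $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to the «wavy»
basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$, while $T = S^{-1}$ is the inverse transition matrix.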

PROOF. In order to prove the first relationship (2.4) we substitute the fourth
expression (2.2) for $h^s$ into the first expansion (2.3):

$$f = \sum_{s=1}^{n} f_s \Bigl( \sum_{r=1}^{n} S^s_r\, \tilde{h}^r \Bigr) = \sum_{r=1}^{n} \Bigl( \sum_{s=1}^{n} S^s_r\, f_s \Bigr) \tilde{h}^r.$$

Then we compare the resulting expansion of $f$ with the second expansion (2.3) and
derive the first formula (2.4). The second formula (2.4) is derived similarly. □

Note that the formulas (2.4) can be derived immediately from the definition 1.5
and from formula (1.6) without using the conjugate bases.

Theorem 2.3. The scalar product of a vector $\mathbf{v}$ and a covector $f$ is determined
by their coordinates according to the formula

$$\langle f \,|\, \mathbf{v} \rangle = \sum_{i=1}^{n} f_i\, v^i = f_1 v^1 + \ldots + f_n v^n. \tag{2.5}$$

Proof. In order to prove (2.5) we use the expansion (1.1) and the relationship (1.6):

$$\langle f \,|\, \mathbf{v} \rangle = f(\mathbf{v}) = \sum_{i=1}^{n} v^i f(\mathbf{e}_i) = \sum_{i=1}^{n} f_i\, v^i.$$

In (2.5) and in the above calculations $f$ is assumed to be expanded in the basis
$h^1, \ldots, h^n$ conjugate to the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, in which $\mathbf{v}$ is expanded. □
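
The formulas (2.4) and (2.5) can be checked together on a small example. The following
Python sketch is an illustration under stated assumptions: the list `S[i][j]` stores
$S^i_j$ (upper index for the row, lower index for the column), the sample matrices and
coordinates are hypothetical, and the coordinates of a vector are transformed by the
rule $\tilde{v}^r = \sum_s T^r_s\, v^s$, which follows from substituting the third
formula (2.2) into the expansion (1.1).

```python
def covector_to_new_basis(f, S):
    """First formula (2.4): the new f_r is the sum over s of S[s][r] * f[s]."""
    n = len(f)
    return [sum(S[s][r] * f[s] for s in range(n)) for r in range(n)]

def vector_to_new_basis(v, T):
    """The new v^r is the sum over s of T[r][s] * v[s], where T is inverse to S."""
    n = len(v)
    return [sum(T[r][s] * v[s] for s in range(n)) for r in range(n)]

def pairing(f, v):
    """Formula (2.5): <f | v> = f_1 v^1 + ... + f_n v^n."""
    return sum(fi * vi for fi, vi in zip(f, v))

# Sample data for n = 2.
S = [[1, 1],
     [0, 1]]
T = [[1, -1],
     [0, 1]]          # T = S^{-1}, written out by hand
f = [3, 5]            # coordinates of a covector in the old dual basis
v = [2, 7]            # coordinates of a vector in the old basis

f_new = covector_to_new_basis(f, S)
v_new = vector_to_new_basis(v, T)
```

The value `pairing(f, v)` coincides with `pairing(f_new, v_new)`: the coordinates of the
covector are transformed by $S$, those of the vector by $T = S^{-1}$, and the two
matrices cancel in the sum (2.5).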

§ 3. Orthogonal complements in a dual space. 

Definition 3.1. Let $S$ be a subset in a linear vector space $V$. The orthogonal
complement of the subset $S$ in the conjugate space $V^*$ is the set $S^{\perp} \subset V^*$ composed
of the covectors each of which is orthogonal to all vectors of $S$.

The above definition of the orthogonal complement $S^{\perp}$ can be expressed by the
formula $S^{\perp} = \{ f \in V^* : \forall\, \mathbf{v}\ ((\mathbf{v} \in S) \Rightarrow (\langle f \,|\, \mathbf{v} \rangle = 0)) \}$.

Theorem 3.1. The operation of constructing orthogonal complements of subsets
$S \subset V$ in the conjugate space $V^*$ possesses the following properties:

(1) $S^{\perp}$ is a subspace in $V^*$;

(2) $S_1 \subset S_2$ implies $(S_2)^{\perp} \subset (S_1)^{\perp}$;

(3) $\langle S \rangle^{\perp} = S^{\perp}$, where $\langle S \rangle$ is the linear span of $S$;

(4) $\Bigl( \bigcup_{i \in I} S_i \Bigr)^{\perp} = \bigcap_{i \in I} (S_i)^{\perp}$.

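Proof. Let's prove the first item of the theorem to begin with. For this
purpose we should verify two conditions from the definition of a subspace. Let
$f_1, f_2 \in S^{\perp}$, then $\langle f_1 \,|\, \mathbf{v} \rangle = 0$ and $\langle f_2 \,|\, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in S$. Therefore, for all
vectors $\mathbf{v} \in S$ we derive the equality $\langle f_1 + f_2 \,|\, \mathbf{v} \rangle = \langle f_1 \,|\, \mathbf{v} \rangle + \langle f_2 \,|\, \mathbf{v} \rangle = 0$, which
means that $f_1 + f_2 \in S^{\perp}$.

Now assume that $f \in S^{\perp}$. Then $\langle f \,|\, \mathbf{v} \rangle = 0$ for all vectors $\mathbf{v} \in S$. Hence, for
the covector $\alpha \cdot f$ we derive $\langle \alpha \cdot f \,|\, \mathbf{v} \rangle = \alpha\, \langle f \,|\, \mathbf{v} \rangle = 0$. This means that $\alpha \cdot f \in S^{\perp}$.
Thus, the first item in the theorem 3.1 is proved.

In order to prove the inclusion $(S_2)^{\perp} \subset (S_1)^{\perp}$ in the second item of the
theorem 3.1 we consider an arbitrary covector $f$ of $(S_2)^{\perp}$. From the condition
$f \in (S_2)^{\perp}$ we derive $\langle f \,|\, \mathbf{v} \rangle = 0$ for any $\mathbf{v} \in S_2$. But $S_1 \subset S_2$, therefore, the
equality $\langle f \,|\, \mathbf{v} \rangle = 0$ holds for any $\mathbf{v} \in S_1$. Then $f \in (S_1)^{\perp}$. This means that
$f \in (S_2)^{\perp}$ implies $f \in (S_1)^{\perp}$. The required inclusion is proved.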

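In order to prove the third item of the theorem note that the linear span of $S$
comprises this set: $S \subset \langle S \rangle$. Applying the item (2) of the theorem, which we have
already proved, we obtain the inclusion $\langle S \rangle^{\perp} \subset S^{\perp}$. Now we need the opposite
inclusion $S^{\perp} \subset \langle S \rangle^{\perp}$. In order to prove it let's remember that the linear span $\langle S \rangle$
consists of all possible linear combinations of the form

$$\mathbf{v} = \alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_r \cdot \mathbf{v}_r, \quad\text{where } \mathbf{v}_i \in S. \tag{3.1}$$

Let $f \in S^{\perp}$, then $\langle f \,|\, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in S$. In particular, this applies to the vectors
$\mathbf{v}_i$ in the expansion (3.1), i.e. $\langle f \,|\, \mathbf{v}_i \rangle = 0$. Then from (3.1) we derive

$$\langle f \,|\, \mathbf{v} \rangle = \alpha_1\, \langle f \,|\, \mathbf{v}_1 \rangle + \ldots + \alpha_r\, \langle f \,|\, \mathbf{v}_r \rangle = 0.$$

This means that $\langle f \,|\, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in \langle S \rangle$. This proves the opposite inclusion
$S^{\perp} \subset \langle S \rangle^{\perp}$ and, thus, completes the proof of the equality $\langle S \rangle^{\perp} = S^{\perp}$.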

Now let's proceed to the proof of the fourth item of the theorem 3.1. For this
purpose we introduce the following notations:

$$S = \bigcup_{i \in I} S_i, \qquad \tilde{S} = \bigcap_{i \in I} (S_i)^{\perp}.$$

Let $f \in S^{\perp}$. Then $\langle f \,|\, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in S$. But $S_i \subset S$ for any $i \in I$. Therefore,
$\langle f \,|\, \mathbf{v} \rangle = 0$ for all $\mathbf{v} \in S_i$ and for all $i \in I$. This means that $f$ belongs to each of
the orthogonal complements $(S_i)^{\perp}$, therefore, it belongs to their intersection. Thus,
we have proved the inclusion $S^{\perp} \subset \tilde{S}$.

Conversely, from the inclusion $f \in (S_i)^{\perp}$ for all $i \in I$ we derive $\langle f \,|\, \mathbf{v} \rangle = 0$ for
all $\mathbf{v} \in S_i$ and for all $i \in I$. This means that the equality $\langle f \,|\, \mathbf{v} \rangle = 0$ holds for all
vectors $\mathbf{v}$ in the union of all sets $S_i$. This proves the converse inclusion $\tilde{S} \subset S^{\perp}$.
Thus, we have proved that $S^{\perp} = \tilde{S}$. The theorem is proved. □

Definition 3.2. Let $S$ be a subset of the conjugate space $V^*$. The orthogonal
complement of $S$ in $V$ is the set $S^{\perp} \subset V$ formed by the vectors each of which is
orthogonal to all covectors of the set $S$.

The above definition of the orthogonal complement $S^{\perp} \subset V$ can be expressed
by the formula $S^{\perp} = \{ \mathbf{v} \in V : \forall\, f\ ((f \in S) \Rightarrow (\langle f \,|\, \mathbf{v} \rangle = 0)) \}$. For this orthogonal
complement one can formulate a theorem quite similar to the theorem 3.1.

Theorem 3.2. The operation of constructing orthogonal complements of subsets
$S \subset V^*$ in $V$ possesses the following four properties:

(1) $S^{\perp}$ is a subspace in $V$;

(2) $S_1 \subset S_2$ implies $(S_2)^{\perp} \subset (S_1)^{\perp}$;

(3) $\langle S \rangle^{\perp} = S^{\perp}$, where $\langle S \rangle$ is the linear span of $S$;

(4) $\Bigl( \bigcup_{i \in I} S_i \Bigr)^{\perp} = \bigcap_{i \in I} (S_i)^{\perp}$.

The proof of this theorem almost literally coincides with the proof of the
theorem 3.1. Therefore, here we omit this proof.

Theorem 3.3. Let $V$ be a finite-dimensional vector space and suppose that we
have a subspace $U \subset V$ and a subspace $W \subset V^*$. Then the condition $W = U^{\perp}$ in
the sense of definition 3.1 is equivalent to the condition $U = W^{\perp}$ in the sense of
definition 3.2.

Proof. Suppose that $W = U^{\perp}$ in the sense of definition 3.1. Then for any
$w \in W$ and for any $\mathbf{u} \in U$ we have the orthogonality $\langle w \,|\, \mathbf{u} \rangle = 0$. By definition
$W^{\perp}$ is the set of all vectors $\mathbf{v} \in V$ such that $\langle w \,|\, \mathbf{v} \rangle = 0$ for all covectors $w \in W$.
Hence, $\mathbf{u} \in U$ implies $\mathbf{u} \in W^{\perp}$ and we have the inclusion $U \subset W^{\perp}$.

However, we need to prove the coincidence $U = W^{\perp}$. Let's do it by
contradiction. Suppose that $U \neq W^{\perp}$. Then there is a vector $\mathbf{v}_0$ such that $\mathbf{v}_0 \in W^{\perp}$
and $\mathbf{v}_0 \notin U$. In this case we can apply the theorem 1.2 which says that there is
a linear functional $f$ such that it vanishes on all vectors $\mathbf{u} \in U$ and is nonzero on
the vector $\mathbf{v}_0$. Since $f$ vanishes on $U$, we have $f \in U^{\perp} = W$, while $\langle f \,|\, \mathbf{v}_0 \rangle \neq 0$, which contradicts the
condition $\mathbf{v}_0 \in W^{\perp}$. This contradiction proves that $U = W^{\perp}$. As a result we have
proved that $W = U^{\perp}$ implies $U = W^{\perp}$.

Now, conversely, let $U = W^{\perp}$. Then for any $w \in W$ and for any $\mathbf{u} \in U$ we
have the orthogonality $\langle w \,|\, \mathbf{u} \rangle = 0$. By definition $U^{\perp}$ is the set of all covectors $f$
perpendicular to all vectors $\mathbf{u} \in U$. Hence, $w \in W$ implies $w \in U^{\perp}$ and we have
the inclusion $W \subset U^{\perp}$.

The next step is to prove the coincidence $W = U^{\perp}$. We shall do it again by
contradiction. Assume that $W \neq U^{\perp}$. Then there is a covector $f_0 \in U^{\perp}$ such
that $f_0 \notin W$. Let's apply the theorem 1.2 to the subspace $W \subset V^*$. In this case it says that there is
a linear functional $\varphi$ in $V^{**}$ such that it vanishes on $W$ and is nonzero on the
covector $f_0$. Remember that we have the canonic isomorphism $h \colon V \to V^{**}$. We
apply $h^{-1}$ to $\varphi$ and get the vector $\mathbf{v} = h^{-1}(\varphi)$. Then we take into account (1.13),
which gives $\langle w \,|\, \mathbf{v} \rangle = \langle \varphi \,|\, w \rangle = 0$ for all $w \in W$, i.e. $\mathbf{v} \in W^{\perp} = U$, and
$\langle f_0 \,|\, \mathbf{v} \rangle = \langle \varphi \,|\, f_0 \rangle \neq 0$. This contradicts the condition $f_0 \in U^{\perp}$.
Hence, by contradiction, $W = U^{\perp}$, i.e. $U = W^{\perp}$ implies $W = U^{\perp}$. The theorem is
completely proved. □

The proposition of the theorem 3.3 can be reformulated as follows: in the case
of a finite-dimensional space $V$ for any subspace $U \subset V$ and for any subspace
$W \subset V^*$ the following relationships are valid:

$$(U^{\perp})^{\perp} = U, \qquad (W^{\perp})^{\perp} = W. \tag{3.2}$$

For arbitrary subsets $S \subset V$ and $R \subset V^*$ (not necessarily subspaces) in the case of a
finite-dimensional space $V$ we have the relationships

$$(S^{\perp})^{\perp} = \langle S \rangle, \qquad (R^{\perp})^{\perp} = \langle R \rangle. \tag{3.3}$$

These relationships (3.3) are derived from (3.2) by using the item (3) in
theorems 3.1 and 3.2.

Theorem 3.4. In the case of a finite-dimensional linear vector space $V$, if $U$ is
a subspace of $V$ or if $U$ is a subspace of $V^*$, then $\dim U + \dim U^{\perp} = \dim V$.

PROOF. Due to the relationships (3.2) the second case $U \subset V^*$ in the
theorem 3.4 is reduced to the first case $U \subset V$ if we replace $U$ by $U^{\perp}$. Therefore, we
consider only the first case $U \subset V$.

Let $\dim V = n$ and $\dim U = s$. We choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_s$ in the subspace $U$
and complete it up to a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in the space $V$. The basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$
determines the conjugate basis $h^1, \ldots, h^n$ in $V^*$. If we specify vectors by their
coordinates in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and if we specify covectors by their coordinates
in the dual basis $h^1, \ldots, h^n$, then we can apply the formula (2.5).

By construction of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ the subspace $U$ consists of the vectors whose
initial $s$ coordinates are arbitrary, while the remaining $n - s$ coordinates
are equal to zero. Therefore the condition $f \in U^{\perp}$ means that the equality

$$\langle f \,|\, \mathbf{v} \rangle = \sum_{i=1}^{s} f_i\, v^i = 0$$

should be fulfilled identically for any numbers $v^1, \ldots, v^s$. This is the case if and
only if the first $s$ coordinates of the covector $f$ are zero. The other $n - s$ coordinates
of $f$ are arbitrary. This means that the subspace $U^{\perp}$ is the linear span of the last
$n - s$ basis vectors of the conjugate basis:

$$U^{\perp} = \langle h^{s+1}, \ldots, h^n \rangle.$$

For the dimension of the subspace $U^{\perp}$ this yields $\dim U^{\perp} = n - s$, hence, we have
the required identity $\dim U + \dim U^{\perp} = \dim V$. The theorem is proved. □
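
The dimension formula can also be observed in a direct computation. The following Python
sketch is an illustration under stated assumptions: the space is $\mathbb{Q}^n$, the subspace
$U$ is given by a (possibly redundant) list of spanning vectors, covectors are stored as
rows of coordinates in the dual of the standard basis, and the names `rref` and
`annihilator_basis` are hypothetical. The orthogonal complement is found as the solution
set of the linear equations $\langle f \,|\, \mathbf{u} \rangle = 0$ by row reduction.

```python
from fractions import Fraction

def rref(rows, n):
    """Reduced row echelon form; returns (nonzero reduced rows, pivot columns)."""
    a = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        p = a[r][col]
        a[r] = [x / p for x in a[r]]
        for i in range(len(a)):
            if i != r and a[i][col] != 0:
                factor = a[i][col]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(col)
        r += 1
    return a[:r], pivots

def annihilator_basis(spanning_vectors, n):
    """Return (basis of the orthogonal complement of U, dim U)."""
    reduced, pivots = rref(spanning_vectors, n)
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for c in free:
        f = [Fraction(0)] * n
        f[c] = Fraction(1)
        for row, p in zip(reduced, pivots):
            f[p] = -row[c]          # solve the reduced equation for the pivot coordinate
        basis.append(f)
    return basis, len(pivots)

# Sample subspace of Q^4; the third spanning vector is the sum of the first two.
U = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 1, 1]]
U_perp, dim_U = annihilator_basis(U, 4)
```

Here `dim_U` and `len(U_perp)` add up to 4, in agreement with the theorem, and each
covector of `U_perp` vanishes on every spanning vector of $U$.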

The theorem 3.4 is known as the theorem on the dimension of orthogonal
complements. As an immediate consequence of this theorem we get

$$\begin{aligned} \{0\}^{\perp} &= V, & V^{\perp} &= \{0\},\\ \{0\}^{\perp} &= V^*, & (V^*)^{\perp} &= \{0\}. \end{aligned} \tag{3.4}$$

All these equalities have a transparent interpretation. The first three of the
equalities (3.4) can be proved immediately without using the finite-dimensionality
of $V$. The proof of the last equality (3.4) uses the corollary of the theorem 1.2,
while this theorem assumes $V$ to be a finite-dimensional space.

Theorem 3.5. In the case of a finite-dimensional space $V$ for any family of
subspaces in $V$ or in $V^*$ the following relationships are fulfilled:

$$\Bigl( \sum_{i \in I} U_i \Bigr)^{\perp} = \bigcap_{i \in I} (U_i)^{\perp}, \qquad \Bigl( \bigcap_{i \in I} U_i \Bigr)^{\perp} = \sum_{i \in I} (U_i)^{\perp}. \tag{3.5}$$

Proof. The sum of subspaces is the span of their union. Therefore, the first
relationship (3.5) is an immediate consequence of the items (3) and (4) in the
theorems 3.1 and 3.2. The finite-dimensionality of $V$ is not used here.

The second relationship (3.5) follows from the first one upon replacing $U_i$ by
$(U_i)^{\perp}$. Indeed, applying (3.2), we derive the equality

$$\Bigl( \sum_{i \in I} (U_i)^{\perp} \Bigr)^{\perp} = \bigcap_{i \in I} ((U_i)^{\perp})^{\perp} = \bigcap_{i \in I} U_i.$$

Now it is sufficient to pass to orthogonal complements in both sides of this equality
and apply (3.2) again. The theorem is proved. □

§ 4. Conjugate mapping. 

Definition 4.1. Let $f \colon V \to W$ be a linear mapping from $V$ to $W$. A
mapping $\varphi \colon W^* \to V^*$ is called a conjugate mapping for $f$ if for any $\mathbf{v} \in V$ and
for any $w \in W^*$ the relationship $\langle \varphi(w) \,|\, \mathbf{v} \rangle = \langle w \,|\, f(\mathbf{v}) \rangle$ is fulfilled.

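The problem of the existence of a conjugate mapping is solved by the
definition 4.1 itself. Indeed, in order to define a mapping $\varphi \colon W^* \to V^*$, for each
functional $w \in W^*$ we should specify the corresponding functional $h = \varphi(w) \in V^*$.
But to specify a functional in $V^*$ means to specify its action upon
an arbitrary vector $\mathbf{v} \in V$. In the sense of this reasoning the defining relationship
for a conjugate mapping is written as follows:

$$h(\mathbf{v}) = \langle h \,|\, \mathbf{v} \rangle = \langle \varphi(w) \,|\, \mathbf{v} \rangle = \langle w \,|\, f(\mathbf{v}) \rangle. \tag{4.1}$$

It is easy to verify that the above equality defines a linear functional $h = h(\mathbf{v})$:

$$\begin{aligned} h(\mathbf{v}_1 + \mathbf{v}_2) &= \langle w \,|\, f(\mathbf{v}_1 + \mathbf{v}_2) \rangle = \langle w \,|\, f(\mathbf{v}_1) + f(\mathbf{v}_2) \rangle\\ &= \langle w \,|\, f(\mathbf{v}_1) \rangle + \langle w \,|\, f(\mathbf{v}_2) \rangle = h(\mathbf{v}_1) + h(\mathbf{v}_2),\\ h(\alpha \cdot \mathbf{v}) &= \langle w \,|\, f(\alpha \cdot \mathbf{v}) \rangle = \langle w \,|\, \alpha \cdot f(\mathbf{v}) \rangle = \alpha\, \langle w \,|\, f(\mathbf{v}) \rangle = \alpha\, h(\mathbf{v}). \end{aligned}$$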

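Theorem 4.1. For a linear mapping $f\colon V \to W$ from V to W the conjugate
mapping $\varphi\colon W^* \to V^*$ is also linear.

Proof. Due to the definition 4.1, for the conjugate mapping $\varphi\colon W^* \to V^*$
we have the following relationships:

$$\begin{aligned}
\varphi(w_1 + w_2)(v) &= \langle w_1 + w_2 \mid f(v) \rangle = \langle w_1 \mid f(v) \rangle + \langle w_2 \mid f(v) \rangle = \\
&= \varphi(w_1)(v) + \varphi(w_2)(v) = (\varphi(w_1) + \varphi(w_2))(v), \\
\varphi(\alpha \cdot w)(v) &= \langle \alpha \cdot w \mid f(v) \rangle = \alpha\, \langle w \mid f(v) \rangle = \\
&= \alpha\, \varphi(w)(v) = (\alpha \cdot \varphi(w))(v).
\end{aligned}$$

Since $v \in V$ is an arbitrary vector of V, from the above calculations we obtain
$\varphi(w_1 + w_2) = \varphi(w_1) + \varphi(w_2)$ and $\varphi(\alpha \cdot w) = \alpha \cdot \varphi(w)$. This means that the
conjugate mapping $\varphi$ is a linear mapping. □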

As we have seen above, the conjugate mapping $\varphi\colon W^* \to V^*$ for a mapping
$f\colon V \to W$ is unique. It is usually denoted $\varphi = f^*$. The operation of passing from
$f$ to its conjugate mapping $f^*$ possesses the following properties:

$$(f + g)^* = f^* + g^*, \qquad (\alpha \cdot f)^* = \alpha \cdot f^*, \qquad (f \circ g)^* = g^* \circ f^*.$$

The first two properties are naturally called the linearity. The third property
makes the operation $f \to f^*$ an analog of the matrix transposition. All three of the
above properties are proved by direct calculations on the basis of the definition 4.1.
We shall not give these calculations here since in what follows we shall not use the
above properties at all.
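
As an illustration only, the third property can be checked in a few lines. The sketch assumes that $g\colon U \to V$ and $f\colon V \to W$ are linear mappings, where $U$ is a hypothetical third linear vector space that the text does not name. For any $w \in W^*$ and any $u \in U$ the definition 4.1, applied first to $f \circ g$, then to $f$ and to $g$, gives

$$\langle (f \circ g)^*(w) \mid u \rangle = \langle w \mid f(g(u)) \rangle = \langle f^*(w) \mid g(u) \rangle = \langle g^*(f^*(w)) \mid u \rangle.$$

Since $u \in U$ is arbitrary, $(f \circ g)^*(w) = g^*(f^*(w))$ for every $w \in W^*$, which is the equality $(f \circ g)^* = g^* \circ f^*$.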

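Theorem 4.2. In the case of finite-dimensional spaces V and W the kernels
and images of the mappings $f\colon V \to W$ and $f^*\colon W^* \to V^*$ are related as follows:

$$\begin{aligned}
\operatorname{Ker} f^* &= (\operatorname{Im} f)^{\perp}, & \operatorname{Ker} f &= (\operatorname{Im} f^*)^{\perp}, \\
\operatorname{Im} f &= (\operatorname{Ker} f^*)^{\perp}, & \operatorname{Im} f^* &= (\operatorname{Ker} f)^{\perp}.
\end{aligned} \tag{4.1}$$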

Proof. The kernel $\operatorname{Ker} f^*$ is the set of linear functionals of $W^*$ that are
mapped to the zero functional in $V^*$ under the action of the mapping $f^*$.
Therefore, $w \in \operatorname{Ker} f^*$ is equivalent to the equality $f^*(w)(v) = 0$ for all $v \in V$. As
a result of simple calculations we obtain

$$f^*(w)(v) = \langle f^*(w) \mid v \rangle = \langle w \mid f(v) \rangle = 0.$$

Hence, the kernel $\operatorname{Ker} f^*$ is the set of covectors orthogonal to the vectors of the
form $f(v)$. But the vectors of the form $f(v) \in W$ constitute the image $\operatorname{Im} f$.
Therefore, $\operatorname{Ker} f^* = (\operatorname{Im} f)^{\perp}$. The first relationship (4.1) is proved. In proving this
relationship we did not use the finite-dimensionality of W. It is valid for
infinite-dimensional spaces as well.

In order to prove the second relationship we consider the orthogonal complement
$(\operatorname{Im} f^*)^{\perp}$. It is formed by the vectors orthogonal to all covectors of the form $f^*(w)$:

$$0 = \langle f^*(w) \mid v \rangle = \langle w \mid f(v) \rangle.$$

Using the finite-dimensionality of W, we apply the corollary of the theorem 1.2.
It says that if $\langle w \mid f(v) \rangle = 0$ for all $w \in W^*$, then $f(v) = 0$. Therefore, we have
$(\operatorname{Im} f^*)^{\perp} = \operatorname{Ker} f$. The second relationship (4.1) is proved. The third and the
fourth relationships are derived from the first and the second ones by means of the
theorem 3.3. Thereby we use the finite-dimensionality of the spaces W and V. □

Let the spaces V and W be finite-dimensional. Let's choose a basis $e_1, \ldots, e_n$
in V and a basis $\tilde e_1, \ldots, \tilde e_m$ in the space W. This choice uniquely determines
the conjugate bases $h^1, \ldots, h^n$ and $\tilde h^1, \ldots, \tilde h^m$ in $V^*$ and $W^*$. Let's consider a
mapping $f\colon V \to W$ and the conjugate mapping $f^*\colon W^* \to V^*$. The matrices of
the mappings $f$ and $f^*$ are determined by the expansions:

$$f(e_j) = \sum_{k=1}^{m} F^k_j\, \tilde e_k, \qquad f^*(\tilde h^i) = \sum_{q=1}^{n} \Phi^i_q\, h^q. \tag{4.2}$$

The second relationship (4.2) is somewhat different in structure from the first
one. The point is that the basis vectors of the dual basis are indexed differently
(with upper indices). However, this relationship implements the same idea as the
first one: the mapping is applied to a basis vector of one space and the result is
expanded in the basis of another space.

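Theorem 4.3. The matrices of the mappings $f$ and $f^*$ determined by the
relationships (4.2) are the same, i. e. $F^i_j = \Phi^i_j$.

Proof. From the definition of the conjugate mapping we derive

$$\langle \tilde h^i \mid f(e_j) \rangle = \langle f^*(\tilde h^i) \mid e_j \rangle. \tag{4.3}$$

Let's calculate separately the left and the right hand sides of this equality using
the expansions (4.2) for this purpose:

$$\begin{aligned}
\langle \tilde h^i \mid f(e_j) \rangle &= \sum_{k=1}^{m} F^k_j\, \langle \tilde h^i \mid \tilde e_k \rangle = \sum_{k=1}^{m} F^k_j\, \delta^i_k = F^i_j, \\
\langle f^*(\tilde h^i) \mid e_j \rangle &= \sum_{q=1}^{n} \Phi^i_q\, \langle h^q \mid e_j \rangle = \sum_{q=1}^{n} \Phi^i_q\, \delta^q_j = \Phi^i_j.
\end{aligned}$$

Substituting the above expressions back into the formula (4.3), we get the required
coincidence of the matrices: $F^i_j = \Phi^i_j$. □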
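
The coincidence of the matrices can be tried out numerically. The following is a minimal sketch under stated assumptions, not part of the original text: vectors and covectors are represented by lists of their coordinates, the matrix `F` is stored as a list of rows with `F[k][j]` standing for $F^k_j$, and the helper names `apply_map`, `conjugate_map` and `pairing` are hypothetical.

```python
def apply_map(F, v):
    """Coordinates of f(v) in W: the k-th one is the sum of F[k][j] * v[j] over j."""
    return [sum(F[k][j] * v[j] for j in range(len(v))) for k in range(len(F))]


def conjugate_map(F, w):
    """Coordinates of the covector f*(w) in V*, computed with the same matrix F."""
    return [sum(w[k] * F[k][j] for k in range(len(F))) for j in range(len(F[0]))]


def pairing(h, v):
    """Value <h | v> of a covector on a vector, both given in conjugate bases."""
    return sum(a * b for a, b in zip(h, v))


# Illustrative data (assumed): dim V = 3, dim W = 2.
F = [[1, 2, 0],
     [0, -1, 3]]
v = [1, 1, 2]   # a vector of V
w = [4, -1]     # a covector of W*

left = pairing(conjugate_map(F, w), v)   # <f*(w) | v>
right = pairing(w, apply_map(F, v))      # <w | f(v)>
print(left == right)
```

In this representation the same array `F` serves both mappings: it acts on the coordinates of vectors from one side and on the coordinates of covectors from the other.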

Remark. In some theorems of this chapter the restriction to the
finite-dimensional case can be removed. However, the proof of such strengthened
versions of these theorems is based on the axiom of choice (see [1]).

BILINEAR AND QUADRATIC FORMS.



§ 1. Symmetric bilinear forms 
and quadratic forms. Recovery formula. 

Definition 1.1. Let V be a linear vector space over a numeric field $\mathbb{K}$. A
numeric function $y = f(v, w)$ with two arguments $v, w \in V$ and with values
in the field $\mathbb{K}$ is called a bilinear form if

(1) $f(v_1 + v_2, w) = f(v_1, w) + f(v_2, w)$ for any two $v_1, v_2 \in V$;

(2) $f(\alpha \cdot v, w) = \alpha\, f(v, w)$ for any $v \in V$ and for any $\alpha \in \mathbb{K}$;

(3) $f(v, w_1 + w_2) = f(v, w_1) + f(v, w_2)$ for any two $w_1, w_2 \in V$;

(4) $f(v, \alpha \cdot w) = \alpha\, f(v, w)$ for any $w \in V$ and for any $\alpha \in \mathbb{K}$.

The bilinear form $f(v, w)$ is linear in its first argument v when the second
argument w is fixed; it is also linear in its second argument w when the first
argument v is fixed.

Definition 1.2. A bilinear form $f(v, w)$ is called a symmetric bilinear form if
$f(v, w) = f(w, v)$.

Definition 1.3. A bilinear form $f(v, w)$ is called a skew-symmetric bilinear
form or an antisymmetric bilinear form if $f(v, w) = -f(w, v)$.

Having a bilinear form $f(v, w)$, one can produce a symmetric bilinear form:

$$f_+(v, w) = \frac{f(v, w) + f(w, v)}{2}. \tag{1.1}$$

Similarly, one can produce a skew-symmetric bilinear form:

$$f_-(v, w) = \frac{f(v, w) - f(w, v)}{2}. \tag{1.2}$$

The operation (1.1) is called the symmetrization of the bilinear form $f$; the
operation (1.2) is called the alternation of this bilinear form. Thereby any bilinear
form is the sum of a symmetric bilinear form and a skew-symmetric one:

$$f(v, w) = f_+(v, w) + f_-(v, w). \tag{1.3}$$

Theorem 1.1. The expansion of a given bilinear form $f(v, w)$ into the sum of
a symmetric and a skew-symmetric bilinear form is unique.

Proof. Let's consider an expansion of $f(v, w)$ into the sum of a symmetric
and a skew-symmetric bilinear form:

$$f(v, w) = h_+(v, w) + h_-(v, w). \tag{1.4}$$

By means of symmetrization and alternation from (1.4) we derive

$$\begin{aligned}
f(v, w) + f(w, v) &= (h_+(v, w) + h_+(w, v)) + (h_-(v, w) + h_-(w, v)) = 2\, h_+(v, w), \\
f(v, w) - f(w, v) &= (h_+(v, w) - h_+(w, v)) + (h_-(v, w) - h_-(w, v)) = 2\, h_-(v, w).
\end{aligned}$$

Hence, $h_+ = f_+$ and $h_- = f_-$. Therefore, the expansion (1.4) coincides with the
expansion (1.3). The theorem is proved. □
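
In coordinates the operations (1.1) and (1.2) act on the matrix of a bilinear form entry by entry. The sketch below is an illustration under assumptions that are not in the text: the form is represented by a square list of rows `F` with `F[i][j]` playing the role of $f(e_i, e_j)$, and the function names `symmetrize` and `alternate` are hypothetical.

```python
from fractions import Fraction


def symmetrize(F):
    """Matrix of f_plus: entries (F[i][j] + F[j][i]) / 2, as in formula (1.1)."""
    n = len(F)
    return [[(Fraction(F[i][j]) + F[j][i]) / 2 for j in range(n)] for i in range(n)]


def alternate(F):
    """Matrix of f_minus: entries (F[i][j] - F[j][i]) / 2, as in formula (1.2)."""
    n = len(F)
    return [[(Fraction(F[i][j]) - F[j][i]) / 2 for j in range(n)] for i in range(n)]


# Illustrative matrix (assumed) of a bilinear form that is neither
# symmetric nor skew-symmetric.
F = [[1, 4],
     [0, 3]]
F_plus = symmetrize(F)
F_minus = alternate(F)

# The expansion (1.3): the two parts add up to the original matrix.
total = [[F_plus[i][j] + F_minus[i][j] for j in range(2)] for i in range(2)]
print(total == F)
```

Exact fractions are used instead of floating point numbers so that the division by 2 introduces no rounding.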

Definition 1.4. A numeric function $y = g(v)$ with one vectorial argument
$v \in V$ is called a quadratic form in a linear vector space V if $g(v) = f(v, v)$ for
some bilinear form $f(v, w)$.

If $g(v) = f(v, v)$, then the quadratic form $g$ is said to be generated by the
bilinear form $f$. For a skew-symmetric bilinear form we have $f_-(v, v) = -f_-(v, v)$.
Hence, $f_-(v, v) = 0$. Then from the expansion (1.3) we derive

$$g(v) = f(v, v) = f_+(v, v). \tag{1.5}$$

The same quadratic form can be generated by several bilinear forms. The
relationship (1.5) shows that any quadratic form can be generated by a symmetric
bilinear form.

Theorem 1.2. For any quadratic form $g(v)$ there is a unique symmetric bilinear
form $f(v, w)$ that generates $g(v)$.

Proof. The existence of a symmetric bilinear form $f(v, w)$ generating $g(v)$
follows from (1.5). Let's prove the uniqueness of this form. From $g(v) = f(v, v)$
and from the symmetry of the form $f$ we derive

$$\begin{aligned}
g(v + w) &= f(v + w, v + w) = f(v, v) + f(v, w) + \\
&+ f(w, v) + f(w, w) = f(v, v) + 2\, f(v, w) + f(w, w).
\end{aligned}$$

Now $f(v, v)$ and $f(w, w)$ in the right hand side of this formula can be replaced by
$g(v)$ and $g(w)$ respectively. Hence, we get

$$f(v, w) = \frac{g(v + w) - g(v) - g(w)}{2}. \tag{1.6}$$

Formula (1.6) shows that the values of the symmetric bilinear form $f(v, w)$ are
uniquely determined by the values of the quadratic form $g(v)$. This proves the
uniqueness of the form $f$. □
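
Formula (1.6) is easy to test on numbers. The sketch below is an illustration under assumptions not made in the text: a symmetric matrix `G` of integers stands for the symmetric bilinear form in some basis, vectors are lists of integer coordinates, and the names `bilinear`, `quadratic` and `recovered` are hypothetical.

```python
from fractions import Fraction


def bilinear(G, v, w):
    """Value of the bilinear form with matrix G on the vectors v and w."""
    n = len(G)
    return sum(G[i][j] * v[i] * w[j] for i in range(n) for j in range(n))


def quadratic(G, v):
    """Value g(v) = f(v, v) of the generated quadratic form."""
    return bilinear(G, v, v)


def recovered(G, v, w):
    """Right hand side of (1.6), computed from values of the quadratic form only."""
    s = [a + b for a, b in zip(v, w)]
    return Fraction(quadratic(G, s) - quadratic(G, v) - quadratic(G, w), 2)


# Illustrative symmetric matrix and vectors (assumed).
G = [[1, 2],
     [2, -3]]
v = [1, 2]
w = [3, -1]
print(recovered(G, v, w) == bilinear(G, v, w))
```

If `G` were not symmetric, `recovered` would return the value of the symmetrized form $f_+$ rather than of $f$ itself, in agreement with (1.5).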

The formula (1.6) is called a recovery formula. Usually, a quadratic form and
the associated symmetric bilinear form are both denoted by the same symbol:
$g(v) = g(v, v)$. Moreover, when a quadratic form is given, we assume without
stipulations that the associated symmetric bilinear form $g(v, w)$ is also given.

Let $f(v, w)$ be a bilinear form in a finite-dimensional linear vector space V and
let $e_1, \ldots, e_n$ be a basis in this space. The numbers $f_{ij}$ determined by the formula

$$f_{ij} = f(e_i, e_j) \tag{1.7}$$

are called the coordinates or the components of the form $f$ in the basis $e_1, \ldots, e_n$.
The numbers (1.7) are written in the form of a matrix

$$F = \begin{pmatrix} f_{11} & \ldots & f_{1n} \\ \vdots & \ddots & \vdots \\ f_{n1} & \ldots & f_{nn} \end{pmatrix}, \tag{1.8}$$

which is called the matrix of the bilinear form $f$ in the basis $e_1, \ldots, e_n$. For
the element $f_{ij}$ in the matrix (1.8) the first index $i$ specifies the row number, the
second index $j$ specifies the column number. The matrix of a symmetric bilinear
form $g$ is also symmetric: $g_{ij} = g_{ji}$. Further, speaking of the matrix of a quadratic
form $g(v)$, we shall mean the matrix of the associated symmetric bilinear form
$g(v, w)$.

Let $v^1, \ldots, v^n$ and $w^1, \ldots, w^n$ be the coordinates of two vectors v and w in the
basis $e_1, \ldots, e_n$. Then the values $f(v, w)$ and $g(v)$ of a bilinear form and of a
quadratic form respectively are calculated by the following formulas:

$$f(v, w) = \sum_{i=1}^{n} \sum_{j=1}^{n} f_{ij}\, v^i\, w^j, \qquad g(v) = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, v^i\, v^j. \tag{1.9}$$

In the case when $g_{ij}$ is a diagonal matrix, the formula for $g(v)$ contains only the
squares of the coordinates of a vector v:

$$g(v) = g_{11}\, (v^1)^2 + \ldots + g_{nn}\, (v^n)^2. \tag{1.10}$$

This supports the term «quadratic form». Bringing a quadratic form to the form
(1.10) by means of choosing a proper basis $e_1, \ldots, e_n$ in a linear space V is one of
the problems which are solved in the theory of quadratic forms.

Let $e_1, \ldots, e_n$ and $\tilde e_1, \ldots, \tilde e_n$ be two bases in a linear vector space V. Let's
denote by S the transition matrix for passing from the first basis to the second
one. Denote $T = S^{-1}$. From (1.7) we easily derive the formula relating the
components of a bilinear form $f(v, w)$ in these two bases. For this purpose it is
sufficient to substitute the relationship (5.8) of Chapter I into the formula (1.7)
and use the bilinearity of the form $f(v, w)$:

$$f_{ij} = f(e_i, e_j) = \sum_{k=1}^{n} \sum_{q=1}^{n} T^k_i\, T^q_j\, f(\tilde e_k, \tilde e_q) = \sum_{k=1}^{n} \sum_{q=1}^{n} T^k_i\, T^q_j\, \tilde f_{kq}.$$

The reverse formula expressing $\tilde f_{kq}$ through $f_{ij}$ is derived similarly:

$$\tilde f_{kq} = f(\tilde e_k, \tilde e_q) = \sum_{i=1}^{n} \sum_{j=1}^{n} S^i_k\, S^j_q\, f(e_i, e_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} S^i_k\, S^j_q\, f_{ij}. \tag{1.11}$$

In matrix form these relationships are written as follows:

$$F = T^{\mathrm{tr}}\, \tilde F\, T, \qquad \tilde F = S^{\mathrm{tr}}\, F\, S. \tag{1.12}$$

Here $S^{\mathrm{tr}}$ and $T^{\mathrm{tr}}$ are two matrices obtained from S and T by transposition.
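
The transformation rule for the matrix of a bilinear form can be checked on a small example. The sketch below rests on assumptions that are not stated in this section: the columns of the transition matrix `S` are taken to hold the coordinates of the new basis vectors in the old basis, so that a vector with new coordinates `x` has old coordinates `S x`; all function names are hypothetical.

```python
def transpose(M):
    """Transposed matrix, given as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matvec(M, x):
    """Product of a matrix and a coordinate column."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def form(F, v, w):
    """Value of the bilinear form with matrix F on coordinate columns v and w."""
    n = len(F)
    return sum(F[i][j] * v[i] * w[j] for i in range(n) for j in range(n))


# Illustrative data (assumed): matrix of the form in the old basis and a
# non-degenerate transition matrix.
F = [[1, 2],
     [0, 3]]
S = [[1, 1],
     [0, 2]]

F_new = matmul(matmul(transpose(S), F), S)   # S^tr F S

# Coordinates of two vectors in the new basis and in the old one.
x_new, y_new = [2, -1], [1, 3]
x_old, y_old = matvec(S, x_new), matvec(S, y_new)

# The value of the form does not depend on the basis used to compute it.
print(form(F_new, x_new, y_new) == form(F, x_old, y_old))
```

Note that the matrix of a bilinear form is transformed by $S^{\mathrm{tr}}$ and $S$, while the matrix of a linear operator would be transformed by $S^{-1}$ and $S$; the two rules agree only for orthogonal matrices $S$.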



§ 2. Orthogonal complements
with respect to a quadratic form.

Definition 2.1. Two vectors v and w in a linear vector space V are called
orthogonal to each other with respect to the quadratic form $g$ if $g(v, w) = 0$.

Definition 2.2. Let S be a subset of a linear vector space V. The orthogonal
complement of the subset S with respect to a quadratic form $g(v)$ is the set of
vectors each of which is orthogonal to all vectors of S with respect to that quadratic
form $g$. The orthogonal complement of S is denoted $S_{\perp} \subset V$.

The orthogonal complement of a subset S with respect to a quadratic form $g$
can be defined formally: $S_{\perp} = \{v \in V : \forall w\, ((w \in S) \Rightarrow (g(v, w) = 0))\}$. For the
orthogonal complements determined by a quadratic form $g(v)$ there is a theorem
analogous to theorems 3.1 and 3.2 in Chapter III.

Theorem 2.1. The operation of constructing orthogonal complements of
subsets $S \subset V$ with respect to a quadratic form $g$ possesses the following properties:

(1) $S_{\perp}$ is a subspace in V;

(2) $S_1 \subset S_2$ implies $(S_2)_{\perp} \subset (S_1)_{\perp}$;

(3) $\langle S \rangle_{\perp} = S_{\perp}$, where $\langle S \rangle$ is the linear span of S;

(4) $\Bigl(\bigcup_{i \in I} S_i\Bigr)_{\perp} = \bigcap_{i \in I} (S_i)_{\perp}$.

Proof. Let's begin by proving the first item of the theorem. For this
purpose we should verify two conditions from the definition of a subspace.

Let $v_1, v_2 \in S_{\perp}$. Then $g(v_1, w) = 0$ and $g(v_2, w) = 0$ for all $w \in S$. Hence,
for all $w \in S$ we have $g(v_1 + v_2, w) = g(v_1, w) + g(v_2, w) = 0$. This means that
$v_1 + v_2 \in S_{\perp}$, so the first condition is verified.

Now let $v \in S_{\perp}$. Then $g(v, w) = 0$ for all $w \in S$. Hence, for the vector $\alpha \cdot v$
we derive $g(\alpha \cdot v, w) = \alpha\, g(v, w) = 0$. This means that $\alpha \cdot v \in S_{\perp}$. Thus, the first
item of the theorem 2.1 is proved.

In order to prove the inclusion $(S_2)_{\perp} \subset (S_1)_{\perp}$ in the second item of the
theorem 2.1 we consider an arbitrary vector v in $(S_2)_{\perp}$. From the condition
$v \in (S_2)_{\perp}$ we get $g(v, w) = 0$ for any $w \in S_2$. But $S_1 \subset S_2$, therefore, the equality
$g(v, w) = 0$ is fulfilled for any $w \in S_1$. Then $v \in (S_1)_{\perp}$. Thus, $v \in (S_2)_{\perp}$ implies
$v \in (S_1)_{\perp}$. This proves the required inclusion.

Now let's proceed to the third item of the theorem. Note that the linear span of
S comprises this set: $S \subset \langle S \rangle$. Applying the second item of the theorem, which is
already proved, we get the inclusion $\langle S \rangle_{\perp} \subset S_{\perp}$. In order to prove the coincidence
$\langle S \rangle_{\perp} = S_{\perp}$ we have to prove the converse inclusion $S_{\perp} \subset \langle S \rangle_{\perp}$. For this purpose
let's remember that the linear span $\langle S \rangle$ is formed by the linear combinations

$$w = \alpha_1 \cdot w_1 + \ldots + \alpha_r \cdot w_r, \quad \text{where } w_i \in S. \tag{2.1}$$

Let $v \in S_{\perp}$, then $g(v, w) = 0$ for all $w \in S$. In particular, this is true for the
vectors $w_i$ in the expansion (2.1): $g(v, w_i) = 0$. Then from (2.1) we derive

$$g(v, w) = \alpha_1\, g(v, w_1) + \ldots + \alpha_r\, g(v, w_r) = 0.$$

Hence, $g(v, w) = 0$ for all $w \in \langle S \rangle$. This proves the converse inclusion $S_{\perp} \subset \langle S \rangle_{\perp}$
and thus completes the proof of the coincidence $\langle S \rangle_{\perp} = S_{\perp}$.

In proving the fourth item of the theorem we introduce the following notations:

$$S = \bigcup_{i \in I} S_i, \qquad \tilde S = \bigcap_{i \in I} (S_i)_{\perp}.$$

Let $v \in S_{\perp}$. Then $g(v, w) = 0$ for all $w \in S$. But $S_i \subset S$ for any $i \in I$. Therefore,
$g(v, w) = 0$ for all $w \in S_i$ and for all $i \in I$. This means that v belongs to each
of the orthogonal complements $(S_i)_{\perp}$ and, hence, it belongs to their intersection.
This proves the inclusion $S_{\perp} \subset \tilde S$.

Conversely, if $v \in (S_i)_{\perp}$ for all $i \in I$, then $g(v, w) = 0$ for all $w \in S_i$ and for all
$i \in I$. Hence, $g(v, w) = 0$ for any vector w in the union of all sets $S_i$. This proves
the converse inclusion $\tilde S \subset S_{\perp}$.

The above two inclusions $S_{\perp} \subset \tilde S$ and $\tilde S \subset S_{\perp}$ prove the coincidence of two sets
$S_{\perp} = \tilde S$. The theorem 2.1 is proved. □

Definition 2.3. The kernel of a quadratic form $g(v)$ in a linear vector space
V is the set $\operatorname{Ker} g = V_{\perp}$ formed by vectors orthogonal to each vector of the space
V with respect to the form $g$.

Definition 2.4. A quadratic form with nontrivial kernel $\operatorname{Ker} g \neq \{0\}$ is called
a degenerate quadratic form. Otherwise, if $\operatorname{Ker} g = \{0\}$, then the form $g$ is called
a non-degenerate quadratic form.

Due to the theorem 2.1 the kernel of a form $g(v)$ is a subspace of the space V,
where it is defined. The term «kernel» is not an accidental choice for denoting
the set $V_{\perp}$. Each quadratic form is associated with some mapping, for which the
subspace $V_{\perp}$ is the kernel.

Definition 2.5. An associated mapping of a quadratic form $g$ is the mapping
$a_g\colon V \to V^*$ that takes each vector v of the space V to the linear functional $f_v$ in
the conjugate space $V^*$ determined by the relationship

$$f_v(w) = g(v, w) \quad \text{for all } w \in V. \tag{2.2}$$

The associated mapping $a_g\colon V \to V^*$ is linear; this fact is immediate from the
bilinearity of the form $g$. Its kernel $\operatorname{Ker} a_g$ coincides with the kernel of the form
$g$. Indeed, the condition $v \in \operatorname{Ker} a_g$ means that the functional $f_v$ determined by
(2.2) is identically zero. Hence, v is orthogonal to all vectors $w \in V$ with respect
to the quadratic form $g(v)$.

The associated mapping $a_g$ relates the orthogonal complements $S_{\perp}$ determined by
the quadratic form $g$ and the orthogonal complements $S^{\perp}$ in a dual space, which
we considered earlier in Chapter III.

Theorem 2.2. For any subset $S \subset V$ and for any quadratic form $g(v)$ in a
linear vector space V the set $S_{\perp}$ is the total preimage of the set $S^{\perp}$ under the
associated mapping $a_g$, i. e. $S_{\perp} = a_g^{-1}(S^{\perp})$.

Proof. The condition $v \in S_{\perp}$ means that $g(v, w) = 0$ for all $w \in S$. But this
equality can be rewritten in the following way:

$$g(v, w) = f_v(w) = a_g(v)(w) = \langle a_g(v) \mid w \rangle = 0 \quad \text{for all } w \in S.$$

Hence, the condition $v \in S_{\perp}$ is equivalent to $a_g(v) \in S^{\perp}$. This proves the required
equality $S_{\perp} = a_g^{-1}(S^{\perp})$. □

According to the definition 2.3, vectors of the kernel $\operatorname{Ker} g$ are orthogonal to all
vectors of the space V with respect to the form $g$. Therefore $(\operatorname{Ker} g)_{\perp} = V$. If we
apply the result of the theorem 2.2 to the kernel $S = \operatorname{Ker} g$, we get

$$a_g^{-1}\bigl((\operatorname{Ker} g)^{\perp}\bigr) = (\operatorname{Ker} g)_{\perp} = V.$$

This result becomes clearer if we write it in the following equivalent form:

$$\operatorname{Im} a_g = a_g(V) \subset (\operatorname{Ker} g)^{\perp}. \tag{2.3}$$

Corollary 1. The image of the associated mapping $a_g$ is enclosed in the
orthogonal complement of its kernel $(\operatorname{Ker} a_g)^{\perp}$, i. e. $\operatorname{Im} a_g \subset (\operatorname{Ker} a_g)^{\perp}$.

This corollary of the theorem 2.2 is derived from the formula (2.3) if we take
into account $\operatorname{Ker} g = \operatorname{Ker} a_g$. For a quadratic form $g$ in a finite-dimensional space
V it can be strengthened.

Corollary 2. For a quadratic form $g(v)$ in a finite-dimensional linear vector
space V the image of the associated mapping $a_g\colon V \to V^*$ coincides with the
orthogonal complement of its kernel $\operatorname{Ker} a_g$:

$$\operatorname{Im} a_g = (\operatorname{Ker} a_g)^{\perp}. \tag{2.4}$$

Proof. Using the theorem 9.4 from Chapter I, we calculate the dimension of
the image $\operatorname{Im} a_g$ of the associated mapping:

$$\dim(\operatorname{Im} a_g) = \dim V - \dim(\operatorname{Ker} a_g).$$

The dimension of the orthogonal complement of $\operatorname{Ker} a_g$ in the dual space is
determined by the theorem 3.4 in Chapter III:

$$\dim (\operatorname{Ker} a_g)^{\perp} = \dim V - \dim(\operatorname{Ker} a_g).$$

As we can see, the dimensions of these two subspaces are equal to each other.
Therefore, we can apply the above corollary 1 and the item (3) of the theorem 4.5
from Chapter I. As a result we get the required equality (2.4). □

Theorem 2.3. Let $U \subset V$ be a subspace of a finite-dimensional space V
comprising the kernel of a quadratic form $g$. For any vector $v \notin U$ there exists a vector
$w \in V$ such that $g(v, w) \neq 0$ and $g(w, u) = 0$ for all $u \in U$.

Proof. This theorem is an analog of the theorem 1.2 from Chapter III. Its
proof is essentially based on that theorem. Applying the theorem 1.2 from
Chapter III, we get that there exists a linear functional $f \in V^*$ such that $f(v) \neq 0$
and $f(u) = \langle f \mid u \rangle = 0$ for all $u \in U$. Due to the last condition this functional
$f$ belongs to the orthogonal complement $U^{\perp}$. From the inclusion $\operatorname{Ker} g \subset U$,
applying the item (2) of the theorem 3.1 from Chapter III, we get $U^{\perp} \subset (\operatorname{Ker} g)^{\perp}$.
Hence, we conclude that $f \in (\operatorname{Ker} g)^{\perp}$.

Now we apply the corollary 2 of the theorem 2.2. Since $\operatorname{Ker} g = \operatorname{Ker} a_g$, from this
corollary we obtain that $(\operatorname{Ker} g)^{\perp} = \operatorname{Im} a_g$. Hence, $f \in \operatorname{Im} a_g$ and there is a vector
$w \in V$ that is taken to $f$ by the associated mapping $a_g$, i. e. $f = a_g(w)$. Then

$$\begin{aligned}
g(v, w) &= a_g(w)(v) = f(v) \neq 0, \\
g(w, u) &= a_g(w)(u) = f(u) = 0 \quad \text{for all } u \in U.
\end{aligned}$$

Due to these relationships we find that w is the very vector that we need to
complete the proof of the theorem. □

Theorem 2.4. Let V be a finite-dimensional linear vector space and let U and
W be two subspaces of V comprising the kernel $\operatorname{Ker} g$ of a quadratic form $g$. Then
the conditions $W = U_{\perp}$ and $U = W_{\perp}$ are equivalent to each other.

Proof. The theorem 2.4 is an analog of the theorem 3.3 from Chapter III. The
proofs of these two theorems are also very similar.

Suppose that the condition $W = U_{\perp}$ is fulfilled. Then for any vector $w \in W$
and for any vector $u \in U$ we have the relationship $g(w, u) = 0$. The set $W_{\perp}$ is
formed by vectors orthogonal to all vectors of W with respect to the quadratic
form $g$. Therefore, we have the inclusion $U \subset W_{\perp}$.

Further proof is by contradiction. Assume that $U \neq W_{\perp}$. Then there is a vector
$v_0$ such that $v_0 \in W_{\perp}$ and $v_0 \notin U$. In this situation we can apply the theorem 2.3
which says that there exists a vector v such that $g(v, v_0) \neq 0$ and $g(v, u) = 0$ for
all $u \in U$. The latter condition means that $v \in U_{\perp} = W$. Then the other condition
$g(v, v_0) \neq 0$ contradicts the initial choice $v_0 \in W_{\perp}$. This contradiction shows
that the assumption $U \neq W_{\perp}$ is not true and we have the coincidence $U = W_{\perp}$.
Thus, $W = U_{\perp}$ implies $U = W_{\perp}$. We can swap U and W and obtain that $U = W_{\perp}$
implies $W = U_{\perp}$. Hence, these two conditions are equivalent. □

The proposition of the theorem 2.4 can be reformulated as follows: for a
subspace $U \subset V$ in a finite-dimensional space V the condition $\operatorname{Ker} g \subset U$ implies
that the double orthogonal complement of U coincides with that subspace: $(U_{\perp})_{\perp} = U$.
For an arbitrary subset $S \subset V$ of a finite-dimensional space V one can derive

$$(S_{\perp})_{\perp} = \langle S \rangle + \operatorname{Ker} g. \tag{2.5}$$

Let's prove the relationship (2.5). Note that vectors of the kernel $\operatorname{Ker} g$ are
orthogonal to all vectors of V. Therefore, joining the vectors of the kernel $\operatorname{Ker} g$
to S, we do not change the orthogonal complement of this subset:

$$S_{\perp} = (S \cup \operatorname{Ker} g)_{\perp}.$$

Now let's apply the item (3) of the theorem 2.1. This yields

$$S_{\perp} = (S \cup \operatorname{Ker} g)_{\perp} = \langle S \cup \operatorname{Ker} g \rangle_{\perp} = (\langle S \rangle + \operatorname{Ker} g)_{\perp}.$$

The subspace $U = \langle S \rangle + \operatorname{Ker} g$ comprises the kernel of the form $g$. Therefore,
$(U_{\perp})_{\perp} = U$. This completes the proof of the relationship (2.5):

$$(S_{\perp})_{\perp} = \bigl((\langle S \rangle + \operatorname{Ker} g)_{\perp}\bigr)_{\perp} = \langle S \rangle + \operatorname{Ker} g.$$

Theorem 2.5. In the case of a finite-dimensional linear vector space V for any
subspace U of V we have the equality

$$\dim U + \dim U_{\perp} = \dim V + \dim(\operatorname{Ker} g \cap U), \tag{2.6}$$

where $U_{\perp}$ is the orthogonal complement of U with respect to the form $g$.

Proof. The vectors of the kernel $\operatorname{Ker} g$ are orthogonal to all vectors of the
space V, therefore, joining them to U, we do not change the orthogonal
complement $U_{\perp}$. Let's denote $W = U + \operatorname{Ker} g$. Then $U_{\perp} = W_{\perp}$. Applying the theorem 6.4
from Chapter I, for the dimension of W we derive the formula

$$\dim W = \dim U + \dim(\operatorname{Ker} g) - \dim(\operatorname{Ker} g \cap U). \tag{2.7}$$

Now let's apply the theorem 2.2 to the subset $S = W$. This yields $W_{\perp} = a_g^{-1}(W^{\perp})$.
Note that $\operatorname{Ker} g \subset W$; this distinguishes W from the initial subspace U. Let's apply the
item (2) of the theorem 3.1 from Chapter III to the inclusion $\operatorname{Ker} g \subset W$ and take
into account the corollary 2 of the theorem 2.2. This yields

$$W^{\perp} \subset (\operatorname{Ker} g)^{\perp} = \operatorname{Im} a_g.$$

The inclusion $W^{\perp} \subset \operatorname{Im} a_g$ means that the preimage of each element $f \in W^{\perp}$
under the mapping $a_g$ is not empty, while the equality $W_{\perp} = a_g^{-1}(W^{\perp})$ shows
that such a preimage is enclosed in $W_{\perp}$. Therefore, $W_{\perp} = a_g^{-1}(W^{\perp})$ implies the
equality $a_g(W_{\perp}) = W^{\perp}$.

Now let's consider the restriction of the associated mapping $a_g$ to the subspace
$W_{\perp}$. We denote this restriction by $a$:

$$a\colon W_{\perp} \to V^*. \tag{2.8}$$

The kernel of the mapping (2.8) coincides with the kernel of the non-restricted
mapping $a_g$ since $\operatorname{Ker} a_g = \operatorname{Ker} g \subset W_{\perp}$. For the image of this mapping we have

$$\operatorname{Im} a = a_g(W_{\perp}) = W^{\perp}.$$

Let's apply the theorem on the sum of dimensions of the kernel and the image (see
theorem 9.4 in Chapter I) to the mapping $a$:

$$\dim(\operatorname{Ker} g) + \dim W^{\perp} = \dim W_{\perp}. \tag{2.9}$$

In order to determine the dimension of $W^{\perp}$ we apply the relationship

$$\dim W + \dim W^{\perp} = \dim V, \tag{2.10}$$

which follows from the theorem 3.4 of Chapter III. Now let's add the relationships
(2.7) and (2.9) and subtract the relationship (2.10). Taking into account the
coincidence $W_{\perp} = U_{\perp}$, we get the required equality (2.6). □
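
The equality (2.6) can be illustrated by a small example. The data below are assumptions chosen for illustration and do not come from the text. Take $V = \mathbb{R}^3$ with a basis $e_1, e_2, e_3$ and the quadratic form $g(v) = (v^1)^2$, whose associated symmetric bilinear form is $g(v, w) = v^1 w^1$. Its kernel is $\operatorname{Ker} g = \langle e_2, e_3 \rangle$.

For $U = \langle e_2 \rangle$ every vector of V is orthogonal to U, so $U_{\perp} = V$ and

$$\dim U + \dim U_{\perp} = 1 + 3 = 3 + 1 = \dim V + \dim(\operatorname{Ker} g \cap U).$$

For $U = \langle e_1 \rangle$ the condition $g(e_1, w) = w^1 = 0$ gives $U_{\perp} = \langle e_2, e_3 \rangle$, while $\operatorname{Ker} g \cap U = \{0\}$, so

$$\dim U + \dim U_{\perp} = 1 + 2 = 3 + 0 = \dim V + \dim(\operatorname{Ker} g \cap U).$$

The correction term $\dim(\operatorname{Ker} g \cap U)$ is thus needed exactly when U meets the kernel of the form.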

The analogs of the relationships (3.4) from Chapter III in the present case are the
relationships $\{0\}_{\perp} = V$ and $V_{\perp} = \operatorname{Ker} g$.

Theorem 2.6. In the case of a finite-dimensional linear vector space V equipped
with a quadratic form $g$, for any family of subspaces in V, each of which comprises
the kernel $\operatorname{Ker} g$, the following relationships are fulfilled:

$$\Bigl(\sum_{i \in I} U_i\Bigr)_{\perp} = \bigcap_{i \in I} (U_i)_{\perp}, \qquad \Bigl(\bigcap_{i \in I} U_i\Bigr)_{\perp} = \sum_{i \in I} (U_i)_{\perp}. \tag{2.11}$$

Proof. In proving the first relationship (2.11) the condition $\operatorname{Ker} g \subset U_i$ is
inessential. This relationship is derived from the items (3) and (4) of the
theorem 2.1 if we take into account that the sum of subspaces is the linear span of the
union of these subspaces.

The second relationship (2.11) is derived from the first one. From the condition
$\operatorname{Ker} g \subset U_i$ we derive that $((U_i)_{\perp})_{\perp} = U_i$ (see theorem 2.4). Let's denote $(U_i)_{\perp} = V_i$
and apply the first relationship (2.11) to the family of subspaces $V_i$:

$$\Bigl(\sum_{i \in I} (U_i)_{\perp}\Bigr)_{\perp} = \Bigl(\sum_{i \in I} V_i\Bigr)_{\perp} = \bigcap_{i \in I} (V_i)_{\perp} = \bigcap_{i \in I} U_i.$$

Now it is sufficient to pass to orthogonal complements in the left and right hand sides
of the above equality and apply the theorem 2.4 again. This yields the required
equality (2.11). The theorem is proved. □

§ 3. Transformation of a quadratic form 
to its canonic form. Inertia indices and signature. 

Definition 3.1. A subspace U in a linear vector space V is called regular with
respect to a quadratic form $g$ if $U \cap U_{\perp} \subset \operatorname{Ker} g$.

Theorem 3.1. Let U be a subspace in a finite-dimensional space V regular with
respect to a quadratic form $g$. Then $U + U_{\perp} = V$.

Proof. Let's denote $W = U + U_{\perp}$ and then let's calculate the dimension of the
subspace W applying the theorem 6.4 from Chapter I:

$$\dim W = \dim U + \dim U_{\perp} - \dim(U \cap U_{\perp}).$$

The vectors of the kernel $\operatorname{Ker} g$ are perpendicular to all vectors of the space V.
Therefore, $\operatorname{Ker} g \subset U_{\perp}$. Moreover, due to the regularity of U with respect to the
form $g$ we have $U \cap U_{\perp} \subset \operatorname{Ker} g$. Therefore, we derive

$$U \cap U_{\perp} = (U \cap U_{\perp}) \cap \operatorname{Ker} g = U \cap (U_{\perp} \cap \operatorname{Ker} g) = U \cap \operatorname{Ker} g.$$

Because of the equality $U \cap U_{\perp} = U \cap \operatorname{Ker} g$ the above formula for the dimension
of the subspace W can be written as follows:

$$\dim W = \dim U + \dim U_{\perp} - \dim(U \cap \operatorname{Ker} g). \tag{3.1}$$

Let's compare (3.1) with the formula (2.6) from the theorem 2.5. This comparison
yields $\dim W = \dim V$. Now, applying the item (3) of the theorem 4.5 from
Chapter I, we get $W = V$. The theorem is proved. □

Theorem 3.2. Let U be a subspace of a finite-dimensional space V regular with
respect to a quadratic form $g$. If $U_{\perp} \neq \operatorname{Ker} g$, then there exists a vector $v \in U_{\perp}$
such that $g(v) \neq 0$.

Proof. The proof is by contradiction. Assume that there is no vector $v \in U_{\perp}$
such that $g(v) \neq 0$. Then the numeric function $g(v)$ is identically zero in the
subspace $U_{\perp}$. Due to the recovery formula (1.6) the numeric function $g(v, w)$ is
also identically zero for all $v, w \in U_{\perp}$.

Now let's apply the theorem 3.1 and expand an arbitrary vector $x \in V$ into a
sum of two vectors $x = u + w$, where $u \in U$ and $w \in U_{\perp}$. Then for an arbitrary
vector v of the subspace $U_{\perp}$ we derive

$$g(v, x) = g(v, u + w) = g(v, u) + g(v, w) = 0 + 0 = 0.$$

The first summand $g(v, u)$ in the right hand side of the above equality is zero since the
subspaces U and $U_{\perp}$ are orthogonal to each other. The second summand $g(v, w)$
is zero due to our assumption in the beginning of the proof. Since $g(v, x) = 0$ for
an arbitrary vector $x \in V$, we get $v \in \operatorname{Ker} g$. But v is an arbitrary vector of the
subspace $U_{\perp}$. Therefore, $U_{\perp} \subset \operatorname{Ker} g$. The converse inclusion $\operatorname{Ker} g \subset U_{\perp}$ is always
valid. Hence, $U_{\perp} = \operatorname{Ker} g$, which contradicts the hypothesis of the theorem. This
contradiction means that the assumption, which we have made in the beginning of
our proof, is not valid and, thus, it proves the existence of a vector $v \in U_{\perp}$ such
that $g(v) \neq 0$. The theorem is proved. □

Theorem 3.3. For any quadratic form $g$ in a finite-dimensional vector space V
there exists a basis $e_1, \ldots, e_n$ such that the matrix of $g$ is diagonal in this basis.

Proof. The case $g = 0$ is trivial. The matrix of the zero quadratic form $g$
is purely zero in any basis. The square $n \times n$ matrix, which is purely zero, is
obviously a diagonal matrix.

Suppose that $g \neq 0$. We shall prove the theorem by induction on the dimension
of the space $\dim V = n$. In the case $n = 1$ the proposition of the theorem is trivial:
any $1 \times 1$ matrix is a diagonal matrix.

Suppose that the theorem is valid for any quadratic form in any space of the
dimension less than n. Let's consider the subspace $U = \operatorname{Ker} g$. It is regular with
respect to the form $g$ and $U_{\perp} = V$. Since $g \neq 0$, we have $U_{\perp} = V \neq \operatorname{Ker} g$.
Therefore, we can apply the theorem 3.2.
According to this theorem, there exists a vector $v_0 \notin U$ such that $g(v_0) \neq 0$. Let's
consider the subspace W obtained by joining $v_0$ to $U = \operatorname{Ker} g$:

$$W = \operatorname{Ker} g + \langle v_0 \rangle = U \oplus \langle v_0 \rangle. \tag{3.2}$$



This subspace W determines the following two cases: $W = V$ or $W \neq V$.

In the case $W = V$ we choose a basis $e_1, \ldots, e_s$ in the kernel $\operatorname{Ker} g$ and
complete it by one additional vector $e_{s+1} = v_0$. As a result we get a basis
in V. The matrix of the quadratic form $g$ in this basis is a matrix almost
completely filled with zeros. Indeed, for $i = 1, \ldots, s$ and $j = 1, \ldots, s + 1$ we have
$g_{ij} = g_{ji} = g(e_i, e_j) = 0$ since $e_i \in \operatorname{Ker} g$. The only nonzero element is $g_{s+1\,s+1}$; it
is a diagonal element: $g_{s+1\,s+1} = g(e_{s+1}, e_{s+1}) = g(v_0) \neq 0$.

In the case $W \neq V$ we consider the intersection $W \cap W_{\perp}$. Let $w \in W \cap W_{\perp}$.
Then from (3.2) we derive $w = \alpha \cdot v_0 + u$, where $u \in \operatorname{Ker} g$. Since w is a vector of
W and simultaneously it is a vector of $W_{\perp}$, it should be orthogonal to itself with
respect to the quadratic form $g$:

$$\begin{aligned}
g(w, w) &= g(\alpha \cdot v_0 + u, \alpha \cdot v_0 + u) = \\
&= \alpha^2\, g(v_0, v_0) + 2\, \alpha\, g(v_0, u) + g(u, u) = 0.
\end{aligned} \tag{3.3}$$

But $u \in \operatorname{Ker} g$, therefore, $g(v_0, u) = 0$ and $g(u, u) = 0$, while $g(v_0, v_0) = g(v_0) \neq 0$.
Hence, from (3.3) we get $\alpha = 0$. This means that $w = u \in \operatorname{Ker} g$. Thus,
we have proved the inclusion $W \cap W_{\perp} \subset \operatorname{Ker} g$, which means the regularity of the
subspace W with respect to the quadratic form $g$.

Now let's apply the theorem 3.1. It yields the expansion $V = W + W_{\perp}$. Note
that $v_0 \in W$, but $v_0 \notin W \cap W_{\perp}$. This follows from $g(v_0, v_0) \neq 0$. Hence, $v_0 \notin W_{\perp}$
and $W_{\perp} \neq V$. This means that the dimension of the subspace $W_{\perp}$ is less than n.
The formula (2.6) yields the exact value of this dimension:

$$\dim W_{\perp} = \dim V + \dim(W \cap \operatorname{Ker} g) - \dim W = n - 1.$$

Let's consider the restriction of $g$ to the subspace $W_{\perp}$. We can apply the inductive
hypothesis to $g$ in $W_{\perp}$. Let $e_1, \ldots, e_{n-1}$ be a basis of the subspace $W_{\perp}$ in which
the matrix of the restriction of $g$ to $W_{\perp}$ is diagonal:

$$g_{ij} = g_{ji} = g(e_i, e_j) = 0 \quad \text{for } i < j \leqslant n - 1. \tag{3.4}$$

We complete this basis by one vector $e_n = v_0$. Since $v_0 \notin W_{\perp}$, the extended
system of vectors $e_1, \ldots, e_n$ is linearly independent and, hence, is a basis of V.
Let's find the matrix of the quadratic form $g$ in the extended basis. For the
elements in the extension of this matrix we obtain the relationships

$$g_{in} = g_{ni} = g(e_i, e_n) = 0 \quad \text{for } i < n. \tag{3.5}$$

They follow from the orthogonality of $e_i$ and $e_n$ in (3.5). Indeed, $e_n \in W$ and
$e_i \in W_{\perp}$. The relationships (3.4) and (3.5) taken together mean that the matrix
of the quadratic form $g$ is diagonal in the basis $e_1, \ldots, e_n$. The inductive step is
over and the theorem is completely proved. □
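
A basis of the kind described in the theorem 3.3 can also be produced by a direct computation. The sketch below is not the inductive construction of the proof; it swaps in a symmetric elimination procedure that uses the same two ideas (pick a vector with $g(v) \neq 0$, then make the remaining vectors orthogonal to it). It assumes that the form is given by a symmetric matrix `A` of rational numbers in some initial basis; the function names are hypothetical.

```python
from fractions import Fraction


def bilinear(A, u, v):
    """Value g(u, v) of the symmetric bilinear form with matrix A."""
    n = len(A)
    return sum(A[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


def diagonalizing_basis(A):
    """Return vectors (coordinate lists) forming a basis in which g is diagonal."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    basis = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
    for k in range(n):
        # Look for a not yet processed vector v with g(v) != 0.
        pivot = next((i for i in range(k, n)
                      if bilinear(A, basis[i], basis[i]) != 0), None)
        if pivot is None:
            # All g(e_i) vanish; if g(e_i, e_j) != 0, then
            # g(e_i + e_j) = 2 g(e_i, e_j) != 0.
            pair = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                         if bilinear(A, basis[i], basis[j]) != 0), None)
            if pair is None:
                break  # the form vanishes on all remaining vectors
            i, j = pair
            basis[i] = [a + b for a, b in zip(basis[i], basis[j])]
            pivot = i
        basis[k], basis[pivot] = basis[pivot], basis[k]
        g_kk = bilinear(A, basis[k], basis[k])
        # Make every later vector orthogonal to basis[k].
        for i in range(k + 1, n):
            c = bilinear(A, basis[i], basis[k]) / g_kk
            basis[i] = [a - c * b for a, b in zip(basis[i], basis[k])]
    return basis


def gram_matrix(A, basis):
    """Matrix of the form in the given basis."""
    return [[bilinear(A, u, v) for v in basis] for u in basis]


A = [[0, 1],
     [1, 0]]          # g(v) = 2 v^1 v^2, an assumed example
D = gram_matrix(A, diagonalizing_basis(A))
print(D[0][1] == 0 and D[1][0] == 0)
```

Each step changes the basis only by swapping two vectors or adding a multiple of one vector to another, so the resulting system remains a basis. Exact fractions avoid rounding errors in the orthogonality checks.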

Let $g$ be a quadratic form in a finite-dimensional space V and let $e_1, \ldots, e_n$
be a basis in which the matrix of $g$ is diagonal. Then the value of $g(v)$ can be
calculated by formula (1.10). A part of the diagonal elements $g_{11}, \ldots, g_{nn}$ can be
equal to zero. Let's denote by s the number of such elements. We can renumber
the basis vectors $e_1, \ldots, e_n$ so that

$$g_{11} = \ldots = g_{ss} = 0. \tag{3.6}$$

The first s vectors of the basis, which correspond to the matrix elements (3.6),
belong to the kernel of the form $\operatorname{Ker} g$. Indeed, if $w = e_i$ for $i = 1, \ldots, s$, then
$g(v, w) = 0$ for all vectors $v \in V$. This fact can be easily derived with the use of
formulas (1.9).

Conversely, suppose that $w \in \operatorname{Ker} g$. Then for an arbitrary vector $v \in V$ we
have the following relationships:

$$g(v, w) = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, v^i\, w^j = \sum_{i=s+1}^{n} g_{ii}\, v^i\, w^i = 0.$$

Since $v \in V$ is an arbitrary vector, the above equality should be fulfilled identically
in $v^{s+1}, \ldots, v^n$. But $g_{ii} \neq 0$ for $i \geqslant s + 1$, therefore, $w^{s+1} = \ldots = w^n = 0$. From
these equalities for the vector w we derive

$$w = w^1 \cdot e_1 + \ldots + w^s \cdot e_s.$$

The conclusion is that any vector w of the kernel $\operatorname{Ker} g$ can be expanded into
a linear combination of the first s basis vectors. Hence, these basis vectors
$e_1, \ldots, e_s$ form a basis in $\operatorname{Ker} g$. The above considerations prove the following
proposition that we present in the form of a theorem.

Theorem 3.4. The number of zeros on the diagonal of the matrix of a quadratic
form $g$, brought to the diagonal form, is a geometric invariant of the form $g$. It
does not depend on the method used for bringing this matrix to a diagonal form and
coincides with the dimension of the kernel of the quadratic form:
$s = \dim(\operatorname{Ker} g)$.

Definition 3.2. The number $s = \dim(\operatorname{Ker} g)$ is called the zero inertia index
of a quadratic form $g$.

Let $g$ be a quadratic form in a linear vector space over the field of complex
numbers $\mathbb{C}$ such that its matrix is diagonal in a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$.
Suppose that $s$ is the zero inertia index of the quadratic form $g$. Without loss of
generality we can assume that the first $s$ basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_s$ form a
basis in the kernel $\operatorname{Ker} g$. We define the numbers $\gamma_1, \ldots, \gamma_n$ by means
of the formula

$$\gamma_i = \begin{cases} 1 & \text{for } i \le s, \\ \sqrt{g_{ii}} & \text{for } i > s. \end{cases} \tag{3.7}$$

Remember that for any complex number one can take its square root, which is
again a complex number. The complex numbers (3.7) are nonzero. We use them in
order to construct the new basis:

$$\tilde{\mathbf{e}}_i = (\gamma_i)^{-1} \cdot \mathbf{e}_i, \quad i = 1, \ldots, n. \tag{3.8}$$

The matrix of the quadratic form $g$ in the new basis (3.8) is again a diagonal
matrix. Indeed, we can explicitly calculate the matrix elements:

$$\tilde{g}_{ij} = g(\tilde{\mathbf{e}}_i, \tilde{\mathbf{e}}_j) = (\gamma_i\, \gamma_j)^{-1}\, g_{ij} = 0 \quad \text{for } i \ne j.$$

For the diagonal elements of the matrix of $g$ we derive

$$\tilde{g}_{ii} = (\gamma_i)^{-2}\, g_{ii} = \begin{cases} 0 & \text{for } i \le s, \\ 1 & \text{for } i > s. \end{cases}$$

The matrix of the quadratic form $g$ in the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ has the
following form, which is called the canonic form of the matrix of a quadratic form
over the field of complex numbers $\mathbb{C}$:

$$G = \operatorname{diag}(\underbrace{0, \ldots, 0}_{s},\ \underbrace{1, \ldots, 1}_{n-s}). \tag{3.9}$$

The matrix $G$ in (3.9) is a diagonal matrix; its diagonal is filled with $s$ zeros
and $n - s$ ones, where $s = \dim \operatorname{Ker} g$.

In the case of a linear vector space over the field of real numbers $\mathbb{R}$ the
canonic form of the matrix of a quadratic form is different from (3.9). Let
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in which the matrix of $g$ is diagonal. The diagonal
elements of this matrix are now subdivided into three groups: zero elements,
positive elements, and negative elements. If $s$ is the number of zero elements and
$r_p$ is the number of positive elements, then the remaining $r_n = n - s - r_p$
elements on the diagonal are negative numbers. Without loss of generality we can
assume that the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are enumerated so that $g_{ii} = 0$
for $i = 1, \ldots, s$ and $g_{ii} > 0$ for $i = s+1, \ldots, s+r_p$. Then $g_{ii} < 0$ for
$i = s+r_p+1, \ldots, n$. In the field of reals we can take the square root only of
non-negative numbers. Therefore, here we define $\gamma_1, \ldots, \gamma_n$ a little bit
differently than it was done in (3.7) for complex numbers:

$$\gamma_i = \begin{cases} 1 & \text{for } i \le s, \\ \sqrt{|g_{ii}|} & \text{for } i > s. \end{cases} \tag{3.10}$$



By means of (3.10) we define a new basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ using the
formulas (3.8). Here is the matrix of the quadratic form $g$ in this basis:

$$G = \operatorname{diag}(\underbrace{0, \ldots, 0}_{s},\ \underbrace{1, \ldots, 1}_{r_p},\ \underbrace{-1, \ldots, -1}_{r_n}). \tag{3.11}$$

CopyRight © Sharipov R.A., 2004.

Definition 3.3. The formula (3.11) defines the canonic form of the matrix of
a quadratic form $g$ in a space over the real numbers $\mathbb{R}$. The integers $r_p$ and
$r_n$ that determine the number of plus ones and the number of minus ones on the
diagonal of the matrix (3.11) are called the positive inertia index and the negative
inertia index of the quadratic form $g$ respectively.
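
The rescaling by the numbers (3.10) can be tried on a concrete diagonal. The
sketch below is only an illustration under stated assumptions: the function names
`canonical_diagonal` and `inertia_indices` are hypothetical, the diagonal is given
as a list of floating point numbers, and the entries are not reordered.

```python
import math


def canonical_diagonal(diag):
    """Rescale a real diagonal g_11, ..., g_nn to entries 0, +1, -1.

    gamma_i = 1 if g_ii = 0 and sqrt(|g_ii|) otherwise, as in (3.10);
    the new diagonal entry is g_ii / gamma_i**2, as for the basis (3.8).
    """
    gammas = [1.0 if d == 0 else math.sqrt(abs(d)) for d in diag]
    return [d / (gamma * gamma) for d, gamma in zip(diag, gammas)]


def inertia_indices(diag):
    """Return (s, r_p, r_n): numbers of zero, positive, negative entries."""
    s = sum(1 for d in diag if d == 0)
    r_p = sum(1 for d in diag if d > 0)
    r_n = sum(1 for d in diag if d < 0)
    return s, r_p, r_n


diagonal = [0.0, 4.0, 9.0, -2.0]
print(canonical_diagonal(diagonal))  # entries equal 0, 1, 1, -1 up to rounding
print(inertia_indices(diagonal))     # (1, 2, 1)
```

Floating point square roots are exact only in special cases, so the rescaled
entries should be compared with 0, 1 and -1 up to a small tolerance.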

Theorem 3.5. The positive and the negative inertia indices $r_p$ and $r_n$ of a
quadratic form $g$ in a space over the field of real numbers $\mathbb{R}$ are geometric
invariants of $g$. They do not depend on the particular way in which the matrix of
$g$ was brought to the diagonal form.

Proof. Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis of a space $V$ in which the matrix of $g$
has the canonic form (3.11). Let's consider the following subspaces:

$$U_+ = \langle \mathbf{e}_1, \ldots, \mathbf{e}_{s+r_p} \rangle, \qquad U_- = \langle \mathbf{e}_{s+r_p+1}, \ldots, \mathbf{e}_n \rangle. \tag{3.12}$$

The intersection of $U_+$ and $U_-$ is trivial, $\dim U_+ = s + r_p$, $\dim U_- = r_n$,
and for their sum we have $U_+ + U_- = V$.

Let's take a vector $\mathbf{v} \in U_+$. The value of the quadratic form $g$ for that vector
is determined by the matrix (3.11) according to the formula (1.10):

$$g(\mathbf{v}) = \sum_{i=s+1}^{s+r_p} (v^i)^2.$$

The sum of squares in the right hand side of this equality is a non-negative
quantity, i.e. $g(\mathbf{v}) \ge 0$ for all $\mathbf{v} \in U_+$.

Now let's take a vector $\mathbf{v} \in U_-$. For this vector the formula (1.10) is written as

$$g(\mathbf{v}) = \sum_{i=s+r_p+1}^{n} \bigl(-(v^i)^2\bigr).$$

If $\mathbf{v} \ne 0$, then at least one summand in the right hand side is nonzero. Hence,
$g(\mathbf{v}) < 0$ for all nonzero vectors of the subspace $U_-$.

Suppose that $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ is some other basis in which the matrix of
$g$ has the canonic form. Denote by $\tilde{s}$, $\tilde{r}_p$, and $\tilde{r}_n$ the inertia indices
of $g$ in this basis. The zero inertia indices in both bases are the same,
$s = \tilde{s}$, since they are determined by the kernel of $g$: $s = \dim(\operatorname{Ker} g)$ and
$\tilde{s} = \dim(\operatorname{Ker} g)$.

Let's prove the coincidence of the positive and the negative inertia indices in
two bases. For this purpose we consider the subspaces $\tilde{U}_+$ and $\tilde{U}_-$
determined by the relationships of the form (3.12) but for the «wavy» basis
$\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. If we assume that $r_p \ne \tilde{r}_p$, then $r_p > \tilde{r}_p$ or
$r_p < \tilde{r}_p$. For the sake of certainty suppose that $r_p > \tilde{r}_p$. Then we calculate
the dimensions of $U_+$ and $\tilde{U}_-$:

$$\dim U_+ = s + r_p, \qquad \dim \tilde{U}_- = \tilde{r}_n = n - s - \tilde{r}_p.$$

For the sum of dimensions of these two subspaces $U_+$ and $\tilde{U}_-$ we get the
equality $\dim U_+ + \dim \tilde{U}_- = n + (r_p - \tilde{r}_p)$. Due to the above assumption
$r_p > \tilde{r}_p$ we derive

$$\dim U_+ + \dim \tilde{U}_- > \dim V. \tag{3.13}$$

From the natural inclusion $U_+ + \tilde{U}_- \subset V$ we get $\dim(U_+ + \tilde{U}_-) \le \dim V$.
Using this estimate together with the inequality (3.13) and applying the theorem 6.4
of Chapter I to them, we derive $\dim(U_+ \cap \tilde{U}_-) > 0$. Hence, the intersection
$U_+ \cap \tilde{U}_-$ is nonzero: it contains a nonzero vector $\mathbf{v} \in U_+ \cap \tilde{U}_-$. From
the conditions $\mathbf{v} \in U_+$ and $\mathbf{v} \in \tilde{U}_-$ we obtain two inequalities

$$g(\mathbf{v}) \ge 0, \qquad g(\mathbf{v}) < 0$$

contradicting each other. This contradiction shows that our assumption
$r_p \ne \tilde{r}_p$ is not valid and the inertia indices $r_p$ and $\tilde{r}_p$ do coincide. From
$r_p = \tilde{r}_p$ and $s = \tilde{s}$ we then derive $r_n = \tilde{r}_n$. The theorem is proved. □

Definition 3.4. The total set of inertia indices is called the signature of a
quadratic form. In the case of a quadratic form in a complex space ($K = \mathbb{C}$) the
signature is formed by two numbers $(s, n - s)$; in the case of a real space
($K = \mathbb{R}$) it is formed by three numbers $(s, r_p, r_n)$.

In the case of a linear space over the field of rational numbers $K = \mathbb{Q}$ we
can also diagonalize the matrix of a quadratic form and subdivide the diagonal
elements into three parts: positive, negative, and zero elements. This determines
the numbers $s$, $r_p$, and $r_n$, which are geometric invariants of $g$, and we can
define its signature.

However, in the case $K = \mathbb{Q}$ we cannot reduce the nonzero diagonal elements
to plus ones and minus ones only. For instance, the diagonal element 2 cannot be
rescaled to 1, since $\sqrt{2}$ is not a rational number. Therefore, the number of
geometric invariants in this case is greater than 3. We shall not look for the
complete set of geometric invariants of a quadratic form in the case $K = \mathbb{Q}$ and
we shall not construct their theory, since this would lead us to number theory,
toward the problems of divisibility, primality, factorization of integers, etc.
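
The diagonalization over the rational numbers and the reading of the signature
can be sketched in exact arithmetic. The code below is an illustration, not the
procedure from the proof of the theorem 3.3: it uses simultaneous row and column
operations (a congruence $S^{tr} G\, S$ at each step), and the names `diagonalize`,
`signature` and `congruent` are hypothetical. The input is assumed to be a
symmetric matrix with integer or rational entries.

```python
from fractions import Fraction


def diagonalize(matrix):
    """Return the diagonal of a diagonal matrix congruent to `matrix`."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)

    def add_multiple(target, source, factor):
        # Basis change e_target -> e_target + factor * e_source:
        # a row operation followed by the same column operation.
        for c in range(n):
            a[target][c] += factor * a[source][c]
        for r in range(n):
            a[r][target] += factor * a[r][source]

    def swap(i, j):
        # Exchange the basis vectors e_i and e_j.
        a[i], a[j] = a[j], a[i]
        for row in a:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if a[k][k] == 0:
            j = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                j = next((j for j in range(k + 1, n) if a[k][j] != 0), None)
                if j is None:
                    continue  # row and column k are already zero
                add_multiple(k, j, Fraction(1))  # new a[k][k] = 2 * a[k][j]
        pivot = a[k][k]
        for i in range(k + 1, n):
            if a[i][k] != 0:
                add_multiple(i, k, -a[i][k] / pivot)
    return [a[i][i] for i in range(n)]


def signature(matrix):
    """Return (s, r_p, r_n) for a symmetric matrix with rational entries."""
    d = diagonalize(matrix)
    return (sum(1 for x in d if x == 0),
            sum(1 for x in d if x > 0),
            sum(1 for x in d if x < 0))


def congruent(g, s):
    """Matrix of the same form in another basis: S^tr G S."""
    n = len(g)
    gs = [[sum(g[i][k] * s[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(s[k][i] * gs[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


g = [[1, 2, 0], [2, 4, 0], [0, 0, -3]]
s = [[1, 1, 0], [0, 1, 2], [0, 0, 1]]   # an invertible transition matrix
print(signature(g))                # (1, 1, 1)
print(signature(congruent(g, s)))  # (1, 1, 1)
```

The two printed triples agree, as the theorems 3.4 and 3.5 predict for any
invertible transition matrix.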

§ 4. Positive quadratic forms.
Sylvester's criterion.

In this section we consider quadratic forms in linear vector spaces over the field
of real numbers $\mathbb{R}$. However, almost all results of this section remain valid for
quadratic forms in rational vector spaces as well.

Definition 4.1. A quadratic form $g$ in a space $V$ over the field $\mathbb{R}$ is called a
positive form if $g(\mathbf{v}) > 0$ for any nonzero vector $\mathbf{v} \in V$.

Theorem 4.1. A quadratic form $g$ in a finite-dimensional space $V$ is positive if
and only if the numbers $s$ and $r_n$ in its signature $(s, r_p, r_n)$ are equal to zero.

Proof. Let $g$ be a positive quadratic form and let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in
which the matrix of $g$ has the canonic form (3.11). If $s \ne 0$, then for the basis
vector $\mathbf{e}_1 \ne 0$ we would get $g(\mathbf{e}_1) = g_{11} = 0$, which would contradict the
positivity of $g$. If $r_n \ne 0$, then for the basis vector $\mathbf{e}_n \ne 0$ we would get
$g(\mathbf{e}_n) = g_{nn} = -1$, which would also contradict the positivity of $g$. Hence,
$s = r_n = 0$.

Now, conversely, let $s = r_n = 0$. Then in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, in which the
matrix of $g$ has the form (3.11), its value $g(\mathbf{v})$ is the sum of squares

$$g(\mathbf{v}) = (v^1)^2 + \ldots + (v^n)^2,$$

where $v^1, \ldots, v^n$ are the coordinates of a vector $\mathbf{v}$. This formula follows from
the formula (1.10). For a nonzero vector at least one of its coordinates is nonzero.
Hence, $g(\mathbf{v}) > 0$. This proves the positivity of $g$ and thus completes the proof
of the theorem as a whole. □

The equality $s = \dim(\operatorname{Ker} g)$ obtained in the theorem 3.4 and the condition
$s = 0$ mean that a positive form $g$ in a finite-dimensional space $V$ is
non-degenerate: $\operatorname{Ker} g = \{0\}$. This fact is valid for a form in an
infinite-dimensional space as well.

Theorem 4.2. Any positive quadratic form $g$ is non-degenerate.

Proof. If $\operatorname{Ker} g \ne \{0\}$, then there is a nonzero vector $\mathbf{v} \in \operatorname{Ker} g$. The
vector $\mathbf{v}$ of the kernel is orthogonal to all vectors of the space $V$. Hence, it is
orthogonal to itself: $g(\mathbf{v}) = g(\mathbf{v}, \mathbf{v}) = 0$. This contradicts the positivity of
the form $g$. Therefore, any positive form $g$ is non-degenerate. □

Theorem 4.3. Any subspace $U \subset V$ is regular with respect to a positive
quadratic form $g$ in a linear vector space $V$.

Proof. Since the kernel $\operatorname{Ker} g$ of a positive form $g$ is zero, the regularity of
a subspace $U$ with respect to $g$ is equivalent to the equality $U \cap U_\perp = \{0\}$
(see definition 3.1). Let's prove this equality. Let $\mathbf{v}$ be an arbitrary vector of
the intersection $U \cap U_\perp$. From $\mathbf{v} \in U_\perp$ we derive that it is orthogonal to
all vectors of $U$. Hence, it is also orthogonal to itself since $\mathbf{v} \in U$. Therefore,
$g(\mathbf{v}) = g(\mathbf{v}, \mathbf{v}) = 0$. Due to the positivity of $g$ the equality $g(\mathbf{v}) = 0$
holds only for the zero vector $\mathbf{v} = 0$. Thus, we get $U \cap U_\perp = \{0\}$. The
theorem is proved. □

Theorem 4.4. For any subspace $U \subset V$ and for any positive quadratic form $g$
in a finite-dimensional space $V$ there is an expansion $V = U \oplus U_\perp$.

Proof. The expansion $V = U + U_\perp$ follows from the theorem 3.1. We need
only to prove that the sum in this expansion is a direct sum. For the sum of the
dimensions of $U$ and $U_\perp$ from the theorem 2.5, due to the triviality of the kernel
$\operatorname{Ker} g = \{0\}$ of a positive quadratic form $g$, we derive

$$\dim U + \dim U_\perp = \dim V.$$

Due to this equality, in order to complete the proof it is sufficient to apply the
theorem 6.3 of Chapter I. □

Let $g$ be a quadratic form in a finite-dimensional space $V$ over the field of
real numbers $\mathbb{R}$. Let's choose an arbitrary basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and then
let's construct the matrix of the quadratic form $g$:

$$G = \begin{pmatrix} g_{11} & \ldots & g_{1n} \\ \vdots & \ddots & \vdots \\ g_{n1} & \ldots & g_{nn} \end{pmatrix}. \tag{4.1}$$

Let's delete the last $n - k$ columns and the last $n - k$ rows in the above matrix
(4.1). The determinant of the matrix thus obtained is called the $k$-th principal
minor of the matrix $G$. We denote this determinant by $M_k$:

$$M_k = \det \begin{pmatrix} g_{11} & \ldots & g_{1k} \\ \vdots & \ddots & \vdots \\ g_{k1} & \ldots & g_{kk} \end{pmatrix}. \tag{4.2}$$

The $n$-th principal minor $M_n$ coincides with the determinant of the matrix $G$.

Theorem 4.5. Let $g$ be a positive quadratic form in a finite-dimensional space
$V$. Then the determinant of the matrix of $g$ in an arbitrary basis of $V$ is positive.

Proof. To begin with, we consider a canonic basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in which the
matrix of $g$ has the canonic form (3.11). According to the theorem 4.1, the matrix
$G$ of a positive quadratic form $g$ in a canonic basis is the unit matrix. Hence, its
determinant is equal to unity and thus it is positive: $\det G = 1 > 0$.

Now let $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ be an arbitrary basis and let $S$ be the transition
matrix for passing from $\mathbf{e}_1, \ldots, \mathbf{e}_n$ to $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. Applying the
formula (1.12) to the matrix $\tilde{G}$ of $g$ in the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$, we get

$$\det \tilde{G} = \det S^{tr}\, (\det G)\, \det S = (\det S)^2.$$

In a linear vector space $V$ over the real numbers $\mathbb{R}$ the elements of any
transition matrix $S$ are real numbers. Its determinant is also a nonzero real
number. Therefore, $(\det S)^2$ is a positive number. The theorem is proved. □

Now again let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be an arbitrary basis of $V$ and let $g_{ij}$ be the
matrix of a positive quadratic form $g$ in this basis. Let's consider the subspace

$$U_k = \langle \mathbf{e}_1, \ldots, \mathbf{e}_k \rangle.$$

Let's denote by $h_k$ the restriction of $g$ to the subspace $U_k$. The matrix of the
form $h_k$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_k$ coincides with the upper left diagonal block
in the matrix of the initial form $g$. This is the very block that determines the
$k$-th principal minor $M_k$ in the formula (4.2). It is clear that the restriction of
a positive form $g$ to any subspace is again a positive quadratic form. Therefore,
we can apply the theorem 4.5 to the form $h_k$. This yields $M_k > 0$.

Conclusion: the positivity of all principal minors (4.2) is a necessary condition
for the positivity of a quadratic form $g$ itself. As it turns out, this condition is
sufficient as well. This fact is known as Sylvester's criterion.

Theorem 4.6 (Sylvester). A quadratic form $g$ in a finite-dimensional space
$V$ is positive if and only if all principal minors of its matrix are positive.

Proof. The positivity of $g$ implies the positivity of all principal minors in its
matrix. This fact is already proved. Let's prove the converse proposition. Suppose
that all principal minors (4.2) in the matrix of a quadratic form $g$ are positive. We
should prove that $g$ is positive. The proof is by induction on $n = \dim V$.

The base of the induction in the case $\dim V = 1$ is obvious. Here the matrix
of $g$ consists of the only element $g_{11}$ that coincides with the only principal
minor: $g_{11} = M_1$. The value $g(\mathbf{v})$ in a one-dimensional space is determined by
the only coordinate of a vector $\mathbf{v}$ according to the formula $g(\mathbf{v}) = g_{11}\,(v^1)^2$.
Therefore, $M_1 > 0$ implies the positivity of the form $g$.

Suppose that the proposition we are going to prove is valid for a quadratic form
in any space of the dimension less than $n = \dim V$. Let $g_{ij}$ be the matrix of our
quadratic form $g$ in some basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ of $V$. Let's denote

$$U = \langle \mathbf{e}_1, \ldots, \mathbf{e}_{n-1} \rangle.$$

Denote by $h$ the restriction of the form $g$ to the subspace $U$ of the dimension
$n - 1$. The matrix elements $h_{ij}$ in the matrix of $h$ calculated in the basis
$\mathbf{e}_1, \ldots, \mathbf{e}_{n-1}$ coincide with the corresponding elements in the matrix of the
initial form: $h_{ij} = g_{ij}$. Therefore, the minors $M_1, \ldots, M_{n-1}$ can be calculated
by means of the matrix $h_{ij}$. Due to the positivity of these minors, applying the
inductive hypothesis, we find that $h$ is a positive quadratic form in $U$.

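Let $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}$ be a basis of $U$ in which the matrix of the form $h$
has the canonic form (3.11). Applying the theorem 4.1 to the form $h$, we conclude
that the matrix of $h$ in the canonic basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}$ is the unit matrix.
Let's complete the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}$ of the subspace $U$ by the vector
$\mathbf{e}_n \notin U$. As a result we get the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}, \mathbf{e}_n$ in which the
matrix of $g$ has the form

$$G_1 = \begin{pmatrix} 1 & \ldots & 0 & \tilde{g}_{1n} \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \ldots & 1 & \tilde{g}_{n-1\,n} \\ \tilde{g}_{n1} & \ldots & \tilde{g}_{n\,n-1} & g_{nn} \end{pmatrix}. \tag{4.3}$$

Here $\tilde{g}_{in} = \tilde{g}_{ni} = g(\tilde{\mathbf{e}}_i, \mathbf{e}_n)$ for $i < n$. The passage from the basis
$\mathbf{e}_1, \ldots, \mathbf{e}_{n-1}, \mathbf{e}_n$ to the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}, \mathbf{e}_n$ is described by a
blockwise-diagonal matrix $S$ of the form

$$S = \begin{pmatrix} S^1_1 & \ldots & S^1_{n-1} & 0 \\ \vdots & \ddots & \vdots & \vdots \\ S^{n-1}_1 & \ldots & S^{n-1}_{n-1} & 0 \\ 0 & \ldots & 0 & 1 \end{pmatrix}. \tag{4.4}$$

The formula (1.12) relates the matrix (4.3) with the matrix $G$ of the quadratic
form $g$ in the initial basis: $G_1 = S^{tr}\, G\, S$. From this formula we derive

$$\det G_1 = \det G\, (\det S)^2 = M_n\, (\det S)^2. \tag{4.5}$$

Due to the above formula (4.5) the positivity of the principal minor $M_n = \det G$
in the initial matrix (4.1) implies the positivity of the determinant of the matrix
(4.3), i.e. $\det G_1 > 0$.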

Let's calculate the determinant of the matrix (4.3) explicitly. For this purpose
we multiply the first column of this matrix by $\tilde{g}_{1n}$ and subtract it from the
last column. Then we multiply the second column by $\tilde{g}_{2n}$ and subtract it from
the last one. We repeat this operation for each of the first $n - 1$ columns of the
matrix (4.3). From the course of algebra we know that such transformations do not
change the determinant of a matrix. In the present case they simplify the matrix
(4.3), bringing it to a lower-triangular form. Therefore, we can calculate the
determinant of the matrix (4.3) in explicit form:

$$\det G_1 = \det \begin{pmatrix} 1 & \ldots & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \ldots & 1 & 0 \\ \tilde{g}_{n1} & \ldots & \tilde{g}_{n\,n-1} & \tilde{g}_{nn} \end{pmatrix} = \tilde{g}_{nn}. \tag{4.6}$$

The element $\tilde{g}_{nn}$ in the transformed matrix is given by the formula

$$\tilde{g}_{nn} = g_{nn} - \sum_{i=1}^{n-1} \tilde{g}_{ni}\, \tilde{g}_{in} = g_{nn} - \sum_{i=1}^{n-1} (\tilde{g}_{in})^2. \tag{4.7}$$

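The matrix of the quadratic form $g$ in the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}, \mathbf{e}_n$ is
close to a diagonal matrix. Let's complete the process of diagonalization by
replacing the vector $\mathbf{e}_n$ by the vector $\tilde{\mathbf{e}}_n \notin U$ such that

$$\tilde{\mathbf{e}}_n = \mathbf{e}_n - \sum_{i=1}^{n-1} \tilde{g}_{in} \cdot \tilde{\mathbf{e}}_i.$$

The passage from $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}, \mathbf{e}_n$ to $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ changes only
the last basis vector. Therefore, the unit diagonal block in the matrix (4.3)
remains unchanged. For the non-diagonal elements $g(\tilde{\mathbf{e}}_k, \tilde{\mathbf{e}}_n)$ in the new basis
we have

$$g(\tilde{\mathbf{e}}_k, \tilde{\mathbf{e}}_n) = \tilde{g}_{kn} - \sum_{i=1}^{n-1} \tilde{g}_{in}\, g(\tilde{\mathbf{e}}_k, \tilde{\mathbf{e}}_i) = \tilde{g}_{kn} - \sum_{i=1}^{n-1} \tilde{g}_{in}\, \tilde{h}_{ki} = 0.$$

The equality $g(\tilde{\mathbf{e}}_k, \tilde{\mathbf{e}}_n) = 0$ in the above formula is due to the fact that the
matrix $\tilde{h}_{ki}$ of the restricted form $h$ in its canonic basis
$\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_{n-1}$ is the unit matrix. For the diagonal element
$g(\tilde{\mathbf{e}}_n, \tilde{\mathbf{e}}_n)$ from this fact we derive

$$g(\tilde{\mathbf{e}}_n, \tilde{\mathbf{e}}_n) = g_{nn} - \sum_{i=1}^{n-1} \sum_{k=1}^{n-1} \tilde{g}_{in}\, \tilde{g}_{kn}\, \tilde{h}_{ik} = g_{nn} - \sum_{i=1}^{n-1} (\tilde{g}_{in})^2.$$

Comparing this expression with (4.7), we find that $g(\tilde{\mathbf{e}}_n, \tilde{\mathbf{e}}_n) = \tilde{g}_{nn}$.
Thus, the matrix of $g$ in the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ is a diagonal matrix of the
form

$$\begin{pmatrix} 1 & \ldots & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \ldots & 1 & 0 \\ 0 & \ldots & 0 & \tilde{g}_{nn} \end{pmatrix}. \tag{4.8}$$

Combining (4.5) and (4.6), for the element $\tilde{g}_{nn}$ in (4.8) we get
$\tilde{g}_{nn} = M_n\, (\det S)^2$. Since the principal minor $M_n$ of the initial matrix
(4.1) is positive, we find that $\tilde{g}_{nn}$ in (4.8) is also positive. Hence, $g$ is a
positive quadratic form. Thus, we have completed the inductive step and have
proved the theorem as a whole. □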
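
Sylvester's criterion is easy to try on small matrices. The following sketch is
an illustration only: the names `det`, `principal_minors`, `is_positive` and
`value` are hypothetical, the matrix is assumed to be symmetric with integer
entries, and the determinant is computed by expansion along the first row, which
is adequate only for small sizes.

```python
def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total


def principal_minors(g):
    """Minors M_1, ..., M_n of the upper left k x k blocks, as in (4.2)."""
    return [det([row[:k] for row in g[:k]]) for k in range(1, len(g) + 1)]


def is_positive(g):
    """Sylvester's criterion: all principal minors are positive."""
    return all(m > 0 for m in principal_minors(g))


def value(g, v):
    """Value g(v) of the quadratic form with matrix g on the coordinates v."""
    n = len(g)
    return sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


a = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
b = [[1, 2], [2, 1]]
print(principal_minors(a), is_positive(a))  # [2, 3, 4] True
print(principal_minors(b), is_positive(b))  # [1, -3] False
print(value(b, [1, -1]))                    # -2, a negative value of the form
```

For the second matrix the minor $M_2$ is negative, and a vector with a negative
value of the form is found directly.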



CHAPTER V

§ 1. The norm and the scalar product. The angle
between vectors. Orthonormal bases.

Definition 1.1. A Euclidean vector space is a linear vector space $V$ over the
field of reals $\mathbb{R}$ which is equipped with some fixed positive quadratic form $g$.

Let $(V, g)$ be a Euclidean vector space. There are many positive quadratic
forms in the linear vector space $V$; however, only one of them is associated with
$V$ so that it defines the structure of Euclidean space in $V$. Two Euclidean vector
spaces $(V, g_1)$ and $(V, g_2)$ with $g_1 \ne g_2$ coincide as linear vector spaces, but
they are different when considered as Euclidean vector spaces.

The structure of the Euclidean vector space $(V, g)$ is associated with a special
terminology and special notations. The value of the quadratic form $g(\mathbf{v})$ is
non-negative. The square root of $g(\mathbf{v})$ is called the norm or the length of a vector
$\mathbf{v}$. The norm of a vector $\mathbf{v}$ is denoted as follows:

$$|\mathbf{v}| = \sqrt{g(\mathbf{v})}. \tag{1.1}$$

The quadratic form $g(\mathbf{v})$ produces the bilinear form $g(\mathbf{v}, \mathbf{w})$ determined by the
recovery formula (1.6) of Chapter IV. The value of that bilinear form is called the
scalar product of two vectors $\mathbf{v}$ and $\mathbf{w}$. The scalar product is denoted as follows:

$$(\mathbf{v}\,|\,\mathbf{w}) = g(\mathbf{v}, \mathbf{w}). \tag{1.2}$$

Due to the notations (1.1) and (1.2), when dealing with some fixed Euclidean space
$(V, g)$, we can omit the symbol $g$ altogether.

The scalar product (1.2) is defined for a pair of two vectors $\mathbf{v}, \mathbf{w} \in V$. It is
quite different from the scalar product (1.8) of Chapter III, which is defined for a
pair of a vector and a covector. The scalar product (1.2) of a Euclidean vector
space possesses the following properties:

(1) $(\mathbf{v}_1 + \mathbf{v}_2\,|\,\mathbf{w}) = (\mathbf{v}_1\,|\,\mathbf{w}) + (\mathbf{v}_2\,|\,\mathbf{w})$ for all $\mathbf{v}_1, \mathbf{v}_2, \mathbf{w} \in V$;

(2) $(\alpha \cdot \mathbf{v}\,|\,\mathbf{w}) = \alpha\,(\mathbf{v}\,|\,\mathbf{w})$ for all $\mathbf{v}, \mathbf{w} \in V$ and for all $\alpha \in \mathbb{R}$;

(3) $(\mathbf{v}\,|\,\mathbf{w}_1 + \mathbf{w}_2) = (\mathbf{v}\,|\,\mathbf{w}_1) + (\mathbf{v}\,|\,\mathbf{w}_2)$ for all $\mathbf{w}_1, \mathbf{w}_2, \mathbf{v} \in V$;

(4) $(\mathbf{v}\,|\,\alpha \cdot \mathbf{w}) = \alpha\,(\mathbf{v}\,|\,\mathbf{w})$ for all $\mathbf{v}, \mathbf{w} \in V$ and for all $\alpha \in \mathbb{R}$;

(5) $(\mathbf{v}\,|\,\mathbf{w}) = (\mathbf{w}\,|\,\mathbf{v})$ for all $\mathbf{v}, \mathbf{w} \in V$;

(6) $|\mathbf{v}|^2 = (\mathbf{v}\,|\,\mathbf{v}) \ge 0$ for all $\mathbf{v} \in V$ and $|\mathbf{v}| = 0$ implies $\mathbf{v} = 0$.

The properties (1)-(4) reflect the bilinearity of the form $g$ in (1.2). They are
analogous to those of the scalar product of a vector and a covector (see formulas
(1.9) in Chapter III).

The properties (5) and (6) have no such analogs. But they are the very
properties that make the scalar product (1.2) a generalization of the scalar
product of 3-dimensional geometric vectors.

Theorem 1.1. The following two additional properties of the scalar product
(1.2) are derived from the properties (1)-(6):

(7) $|(\mathbf{v}\,|\,\mathbf{w})| \le |\mathbf{v}|\,|\mathbf{w}|$ for all $\mathbf{v}, \mathbf{w} \in V$;

(8) $|\mathbf{v} + \mathbf{w}| \le |\mathbf{v}| + |\mathbf{w}|$ for all $\mathbf{v}, \mathbf{w} \in V$.

The property (7) is known as the Cauchy-Bunyakovsky-Schwarz inequality, while
the property (8) is called the triangle inequality.

Proof. In order to prove the inequality (7) we choose two arbitrary nonzero
vectors $\mathbf{v}, \mathbf{w} \in V$ (if one of the vectors is zero, both sides of (7) vanish) and
consider the numeric function $f(\alpha)$ of a numeric argument $\alpha$ defined by the
following explicit formula:

$$f(\alpha) = |\mathbf{v} + \alpha \cdot \mathbf{w}|^2. \tag{1.3}$$

Using the properties (1)-(6), we find that $f(\alpha)$ is a polynomial of degree two:

$$f(\alpha) = |\mathbf{v} + \alpha \cdot \mathbf{w}|^2 = (\mathbf{v} + \alpha \cdot \mathbf{w}\,|\,\mathbf{v} + \alpha \cdot \mathbf{w}) = (\mathbf{v}\,|\,\mathbf{v}) + 2\,\alpha\,(\mathbf{v}\,|\,\mathbf{w}) + \alpha^2\,(\mathbf{w}\,|\,\mathbf{w}).$$

The function (1.3) has a lower bound: $f(\alpha) \ge 0$. This follows from the property
(6). Let's calculate the minimum point of the function $f(\alpha)$ by equating its
derivative $f'(\alpha)$ to zero. This yields the following equation:

$$f'(\alpha) = 2\,(\mathbf{v}\,|\,\mathbf{w}) + 2\,\alpha\,(\mathbf{w}\,|\,\mathbf{w}) = 0.$$

Solving this equation, we find $\alpha_{\min} = -(\mathbf{v}\,|\,\mathbf{w})/(\mathbf{w}\,|\,\mathbf{w})$. Now let's write the
condition $f(\alpha) \ge 0$ for the minimal value of the function $f(\alpha)$:

$$f_{\min} = f(\alpha_{\min}) = \frac{|\mathbf{v}|^2\,|\mathbf{w}|^2 - (\mathbf{v}\,|\,\mathbf{w})^2}{|\mathbf{w}|^2} \ge 0. \tag{1.4}$$

The denominator of the fraction (1.4) is positive; therefore, from the inequality
(1.4) we easily derive the property (7).

In order to prove the property (8) we consider the square of the norm for the
vector $\mathbf{v} + \mathbf{w}$. For this quantity we derive

$$|\mathbf{v} + \mathbf{w}|^2 = (\mathbf{v} + \mathbf{w}\,|\,\mathbf{v} + \mathbf{w}) = |\mathbf{v}|^2 + 2\,(\mathbf{v}\,|\,\mathbf{w}) + |\mathbf{w}|^2. \tag{1.5}$$

Applying the property (7), which is already proved, for the right hand side of the
equality (1.5) we get the following estimate:

$$|\mathbf{v}|^2 + 2\,(\mathbf{v}\,|\,\mathbf{w}) + |\mathbf{w}|^2 \le |\mathbf{v}|^2 + 2\,|\mathbf{v}|\,|\mathbf{w}| + |\mathbf{w}|^2 = (|\mathbf{v}| + |\mathbf{w}|)^2.$$

From the relationship (1.5) and from the above inequality we derive the other
inequality $|\mathbf{v} + \mathbf{w}|^2 \le (|\mathbf{v}| + |\mathbf{w}|)^2$. Now the property (8) is derived by taking
the square root of both sides of this inequality. This operation is correct since
$y = \sqrt{x}$ is an increasing function on the real semiaxis $[0, +\infty)$. The theorem is
proved. □

Due to the analogy of (1.2) and the scalar product of geometric vectors and due
to the Cauchy-Bunyakovsky-Schwarz inequality $|(\mathbf{v}\,|\,\mathbf{w})| \le |\mathbf{v}|\,|\mathbf{w}|$ we can
introduce the concept of an angle between vectors in a Euclidean vector space.

Definition 1.2. The number $\varphi$ from the interval $0 \le \varphi \le \pi$, which is
determined by the following implicit formula

$$\cos(\varphi) = \frac{(\mathbf{v}\,|\,\mathbf{w})}{|\mathbf{v}|\,|\mathbf{w}|}, \tag{1.6}$$

is called the angle between two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ in a Euclidean space $V$.

Due to the property (7) from the theorem 1.1 the modulus of the fraction in the
right hand side of (1.6) is not greater than 1. Therefore, the formula (1.6) is
correct. It determines the unique number $\varphi$ from the specified interval
$0 \le \varphi \le \pi$.
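
The norm (1.1), the scalar product (1.2), the inequalities (7) and (8), and the
angle (1.6) can be checked numerically once a positive form is fixed by its
matrix. The sketch below makes several assumptions for illustration: the matrix
`g` is a hypothetical example of a positive form in a two-dimensional space, and
the function names `scalar_product`, `norm` and `angle` are not taken from the
text.

```python
import math


def scalar_product(g, v, w):
    """(v | w) = sum over i, j of g_ij v^i w^j for a form with matrix g."""
    n = len(g)
    return sum(g[i][j] * v[i] * w[j] for i in range(n) for j in range(n))


def norm(g, v):
    """|v| = sqrt(g(v)), formula (1.1)."""
    return math.sqrt(scalar_product(g, v, v))


def angle(g, v, w):
    """Angle between nonzero vectors v and w, formula (1.6)."""
    c = scalar_product(g, v, w) / (norm(g, v) * norm(g, w))
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly beyond [-1, 1]
    return math.acos(c)


g = [[2.0, 1.0], [1.0, 2.0]]   # principal minors 2 and 3, so g is positive
v, w = [1.0, 0.0], [0.0, 1.0]
vw = [v[0] + w[0], v[1] + w[1]]

print(abs(scalar_product(g, v, w)) <= norm(g, v) * norm(g, w))  # property (7)
print(norm(g, vw) <= norm(g, v) + norm(g, w))                   # property (8)
print(angle(g, v, w))  # close to pi/3, since cos(phi) = 1/2 for this form
```

Note that the coordinate vectors (1, 0) and (0, 1) are not orthogonal with
respect to this form: the angle depends on the choice of $g$.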

Definition 1.3. Two vectors $\mathbf{v}$ and $\mathbf{w}$ in a Euclidean space $V$ are called
orthogonal vectors if they form a right angle ($\varphi = \pi/2$).

The definition 1.3 applies only to nonzero vectors $\mathbf{v}$ and $\mathbf{w}$. The definition 2.1
of Chapter IV is more general. Let's reformulate it for the case of Euclidean spaces.

Definition 1.4. Two vectors $\mathbf{v}$ and $\mathbf{w}$ in a Euclidean space $V$ are called
orthogonal vectors if their scalar product is zero: $(\mathbf{v}\,|\,\mathbf{w}) = 0$.

For nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ these two definitions 1.3 and 1.4 are equivalent.
Let $\mathbf{v}_1, \ldots, \mathbf{v}_m$ be a system of vectors in a Euclidean space $(V, g)$. The matrix
$g_{ij}$ composed of the mutual scalar products of these vectors

$$g_{ij} = (\mathbf{v}_i\,|\,\mathbf{v}_j) \tag{1.7}$$

is called the Gram matrix of the system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$.

Theorem 1.2. A system of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ in a Euclidean space is
linearly dependent if and only if the determinant of their Gram matrix is equal to
zero.

Proof. Suppose that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ are linearly dependent. Then
there is a nontrivial linear combination of these vectors which is equal to zero:

$$\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_m \cdot \mathbf{v}_m = 0. \tag{1.8}$$

Using the coefficients of the linear combination (1.8), we construct the following
expression with the components of the Gram matrix (1.7):

$$\sum_{j=1}^{m} g_{ij}\, \alpha_j = \sum_{j=1}^{m} (\mathbf{v}_i\,|\,\mathbf{v}_j)\, \alpha_j = (\mathbf{v}_i\,|\,\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_m \cdot \mathbf{v}_m) = (\mathbf{v}_i\,|\,0) = 0.$$

Since $i$ is a free index running over the integers from 1 to $m$, this formula
means that the columns of the Gram matrix are linearly dependent. Hence, its
determinant is equal to zero (this fact is known from the course of general
algebra).

Conversely, assume that the determinant of the Gram matrix (1.7) is equal to
zero. Then the columns of this matrix are linearly dependent and, hence, there is
a nontrivial linear combination of them that is equal to zero:

$$\sum_{j=1}^{m} g_{ij}\, \alpha_j = 0. \tag{1.9}$$

Let's denote $\mathbf{v} = \alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_m \cdot \mathbf{v}_m$. Then consider the following
double sum, which is obviously equal to zero due to the equality (1.9):

$$0 = \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i\, g_{ij}\, \alpha_j = \sum_{i=1}^{m} \alpha_i\, (\mathbf{v}_i\,|\,\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_m \cdot \mathbf{v}_m) = (\alpha_1 \cdot \mathbf{v}_1 + \ldots + \alpha_m \cdot \mathbf{v}_m\,|\,\mathbf{v}) = (\mathbf{v}\,|\,\mathbf{v}) = |\mathbf{v}|^2.$$

Thus, we get $|\mathbf{v}|^2 = 0$ and, using the positivity of the basic quadratic form $g$
of the Euclidean space $V$, we derive $\mathbf{v} = 0$. Since $\mathbf{v} = 0$ and the coefficients
$\alpha_1, \ldots, \alpha_m$ are not all zero, we get a nontrivial linear combination of the
form (1.8), which is equal to zero. Hence, our vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ are linearly
dependent. □
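
The Gram determinant test can be illustrated in exact integer arithmetic. The
sketch below assumes, for illustration only, the standard scalar product in
coordinates (the sum of products of coordinates) and uses hypothetical names
`gram_matrix`, `det` and `linearly_dependent`; the determinant is computed by
expansion along the first row, which is adequate only for small systems.

```python
def gram_matrix(vectors):
    """Gram matrix g_ij = (v_i | v_j) for the standard scalar product."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors]
            for u in vectors]


def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total


def linearly_dependent(vectors):
    """Theorem 1.2: dependence is equivalent to a zero Gram determinant."""
    return det(gram_matrix(vectors)) == 0


print(linearly_dependent([(1, 0, 0), (1, 1, 0)]))             # False
print(linearly_dependent([(1, 0, 1), (0, 1, 1), (1, 1, 2)]))  # True
```

In the second system the third vector is the sum of the first two, and the Gram
determinant vanishes accordingly.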

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in a finite-dimensional Euclidean vector space
$(V, g)$. Let's consider the Gram matrix of this basis. Knowing the components of
the Gram matrix, we can calculate the norm of vectors (1.1) and the scalar product
of vectors (1.2) through their coordinates:

$$|\mathbf{v}|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, v^i\, v^j, \qquad (\mathbf{v}\,|\,\mathbf{w}) = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, v^i\, w^j. \tag{1.10}$$

A basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in a Euclidean space $V$ is called an orthonormal basis if
the Gram matrix for the basis vectors is the unit matrix:

$$g_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j. \end{cases} \tag{1.11}$$

If the condition (1.11) is not fulfilled, then the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is called a
skew-angular basis. In an orthonormal basis the vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are unit
vectors orthogonal to each other. This simplifies the formulas (1.10) substantially:

$$|\mathbf{v}|^2 = \sum_{i=1}^{n} (v^i)^2, \qquad (\mathbf{v}\,|\,\mathbf{w}) = \sum_{i=1}^{n} v^i\, w^i. \tag{1.12}$$

Orthonormal bases do exist. Due to (1.2) and (1.7) we know that the Gram
matrix of the basis vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is the matrix of the quadratic form $g$
in this basis. The theorem 3.3 of Chapter IV says that there exists a basis in which
the matrix of $g$ has its canonic form (see (3.11) in Chapter IV). Since $g$ is a
positive quadratic form, its matrix in a canonic form is the unit matrix (see
theorem 4.1 in Chapter IV).

The theorem 4.8 on completing the basis of a subspace formulated in Chapter I
has its analog for orthonormal bases.

Theorem 1.3. Let $\mathbf{e}_1, \ldots, \mathbf{e}_s$ be an orthonormal basis in a subspace $U$ of
a finite-dimensional Euclidean space $(V, g)$. Then it can be completed up to an
orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$.

Proof. Let's consider the orthogonal complement $U_\perp$ of the subspace $U$.
According to the theorem 4.4 of Chapter IV, the subspaces $U$ and $U_\perp$ define the
expansion of the space $V$ into a direct sum:

$$V = U \oplus U_\perp.$$

The subspace $U_\perp$ inherits the structure of a Euclidean space from $V$. Let's choose
an orthonormal basis $\mathbf{e}_{s+1}, \ldots, \mathbf{e}_n$ in $U_\perp$ and then join together the two
bases of $U$ and $U_\perp$. As a result we get a basis in $V$ (see theorem 6.3 of
Chapter I). The vectors of this basis are of unit length and they are orthogonal
to each other. Hence, this is an orthonormal basis completing the initial basis
$\mathbf{e}_1, \ldots, \mathbf{e}_s$ of the subspace $U$. □
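
The proof above is not constructive. One common constructive alternative, which
is a different technique from the argument in the text, is the Gram-Schmidt
process: each new vector is made orthogonal to the vectors already obtained and
then normalized. The sketch below is an illustration under assumptions: the names
`scalar_product` and `gram_schmidt` are hypothetical, the form is given by its
matrix `g` in some basis, and the input vectors are assumed to be linearly
independent.

```python
import math


def scalar_product(g, v, w):
    """(v | w) = sum over i, j of g_ij v^i w^j for a form with matrix g."""
    n = len(g)
    return sum(g[i][j] * v[i] * w[j] for i in range(n) for j in range(n))


def gram_schmidt(g, vectors):
    """Orthonormalize linearly independent vectors with respect to g."""
    basis = []
    for v in vectors:
        u = list(v)
        for e in basis:
            c = scalar_product(g, u, e)          # component of u along e
            u = [ui - c * ei for ui, ei in zip(u, e)]
        length = math.sqrt(scalar_product(g, u, u))
        basis.append([ui / length for ui in u])  # assumes length > 0
    return basis


g = [[2.0, 1.0], [1.0, 2.0]]
e1 = [1.0 / math.sqrt(2.0), 0.0]   # an orthonormal basis of a line U
completed = gram_schmidt(g, [e1, [0.0, 1.0]])
for a in completed:
    print([round(scalar_product(g, a, b), 12) for b in completed])
```

If the first $s$ input vectors already form an orthonormal basis of $U$, the
process leaves them unchanged up to rounding, so the result completes that basis
in the sense of the theorem 1.3.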

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ be two orthonormal bases and let
$S$ be the transition matrix for passing from the first basis to the second one.
The Gram matrices of these two bases are the unit matrices. Therefore, applying the
formula (1.12) of Chapter IV, for the transition matrix $S$ we derive

$$S^{tr}\, S = 1, \qquad S^{-1} = S^{tr}. \tag{1.13}$$

Note that a square matrix $S$ satisfying the above relationships (1.13) is called an
orthogonal matrix.

From the relationships (1.13) for the determinant of an orthogonal matrix we
get $(\det S)^2 = 1$. Therefore, orthogonal matrices are subdivided into two types:
matrices with positive determinant $\det S = 1$ and those with negative determinant
$\det S = -1$. This subdivision is related to the concept of orientation. All bases in
a linear vector space over the field of real numbers $\mathbb{R}$ (not necessarily a
Euclidean space) can be subdivided into two sets which can be called «left bases»
and «right bases». The transition matrix for passing from a left basis to a left
basis or for passing from a right basis to another right basis is a matrix with
positive determinant; it does not change the orientation. The transition matrix for
passing from a left basis to a right basis or, conversely, from a right basis to a
left basis is a matrix with negative determinant. Such a transition matrix changes
the orientation of a basis. We say that a linear vector space $V$ over the field of
real numbers $\mathbb{R}$ is equipped with the orientation if there is some mechanism to
distinguish one of two types of bases in $V$.
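
The relationships (1.13) and the two possible values of the determinant can be
checked on two familiar examples in the plane. This is only a sketch: the names
`transpose`, `matmul`, `det`, `is_orthogonal` and `rotation`, as well as the
tolerance `tol`, are hypothetical choices made for the illustration.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total


def is_orthogonal(s, tol=1e-12):
    """Check S^tr S = 1 from (1.13) up to the rounding tolerance tol."""
    p = matmul(transpose(s), s)
    n = len(s)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


def rotation(t):
    """Matrix of the rotation of the plane by the angle t."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


reflection = [[1.0, 0.0], [0.0, -1.0]]

print(is_orthogonal(rotation(0.7)), det(rotation(0.7)))  # True, det close to 1
print(is_orthogonal(reflection), det(reflection))        # True, det equal to -1
```

The rotation preserves the orientation of a basis, while the reflection changes
it, in agreement with the signs of the two determinants.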

§ 2. Quadratic forms in a Euclidean space.
Diagonalization of a pair of quadratic forms.

Let $(V, g)$ be a Euclidean vector space and let $\varphi$ be a quadratic form in $V$.
For such a form $\varphi$ we define the following ratio for nonzero vectors $\mathbf{v}$:

$$\mu(\mathbf{v}) = \frac{|\varphi(\mathbf{v})|}{|\mathbf{v}|^2}. \tag{2.1}$$

The number $\mu(\mathbf{v})$ in (2.1) is a real non-negative number. Note that
$\mu(\alpha \cdot \mathbf{v}) = \mu(\mathbf{v})$ for any nonzero $\alpha \in \mathbb{R}$. Therefore, we can assume
$\mathbf{v}$ in (2.1) to be a unit vector. Let's denote by $\|\varphi\|$ the least upper bound of
$\mu(\mathbf{v})$ for all unit vectors (such vectors sweep out the unit sphere in the
Euclidean space $V$):

$$\|\varphi\| = \sup_{|\mathbf{v}| = 1} \mu(\mathbf{v}). \tag{2.2}$$

Definition 2.1. The quantity $\|\varphi\|$ determined by the formulas (2.1) and (2.2)
is called the norm of a quadratic form $\varphi$ in a Euclidean vector space $V$. If the
norm $\|\varphi\|$ is finite, the form $\varphi$ is said to be a restricted (that is, bounded)
quadratic form.

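Theorem 2.1. If $\varphi$ is a restricted quadratic form, then there is the estimate
$|\varphi(\mathbf{v}, \mathbf{w})| \le \|\varphi\|\, |\mathbf{v}|\, |\mathbf{w}|$ for the values of the corresponding symmetric
bilinear form.

Proof. In order to calculate $\varphi(\mathbf{v}, \mathbf{w})$ we use the following equality, which, in
essence, is a version of the recovery formula:

$$4\,\alpha\,\varphi(\mathbf{v}, \mathbf{w}) = \varphi(\mathbf{v} + \alpha \cdot \mathbf{w}) - \varphi(\mathbf{v} - \alpha \cdot \mathbf{w}). \tag{2.3}$$

From (2.3) we derive the following inequality for the quantity $4\,\alpha\,\varphi(\mathbf{v}, \mathbf{w})$:

$$4\,\alpha\,\varphi(\mathbf{v}, \mathbf{w}) \le |\varphi(\mathbf{v} + \alpha \cdot \mathbf{w})| + |\varphi(\mathbf{v} - \alpha \cdot \mathbf{w})|. \tag{2.4}$$

Now let's apply the inequality $|\varphi(\mathbf{u})| \le \|\varphi\|\, |\mathbf{u}|^2$ derived from (2.1) and
(2.2) in order to estimate the right hand side of (2.4). This yields

$$4\,\alpha\,\varphi(\mathbf{v}, \mathbf{w}) \le \|\varphi\|\, \bigl(|\mathbf{v} + \alpha \cdot \mathbf{w}|^2 + |\mathbf{v} - \alpha \cdot \mathbf{w}|^2\bigr). \tag{2.5}$$

Let's express the squares of moduli through the scalar products:

$$|\mathbf{v} \pm \alpha \cdot \mathbf{w}|^2 = |\mathbf{v}|^2 \pm 2\,\alpha\,(\mathbf{v}\,|\,\mathbf{w}) + \alpha^2\, |\mathbf{w}|^2.$$

Then we can simplify the inequality (2.5), bringing it to the following one:

$$4\,\alpha\,\varphi(\mathbf{v}, \mathbf{w}) \le 2\,\|\varphi\|\, \bigl(|\mathbf{v}|^2 + \alpha^2\, |\mathbf{w}|^2\bigr).$$

Now let's transform the above inequality a little bit more:

$$f(\alpha) = \alpha^2\, \|\varphi\|\, |\mathbf{w}|^2 - 2\,\alpha\,\varphi(\mathbf{v}, \mathbf{w}) + \|\varphi\|\, |\mathbf{v}|^2 \ge 0.$$

The numeric function $f(\alpha)$ of a numeric argument $\alpha$ is a polynomial of degree
two in $\alpha$. Let's find the minimum point $\alpha = \alpha_{\min}$ for this function by
equating its derivative to zero: $f'(\alpha) = 0$. As a result we obtain

$$\alpha_{\min} = \frac{\varphi(\mathbf{v}, \mathbf{w})}{\|\varphi\|\, |\mathbf{w}|^2}.$$

Now let's write the inequality $f(\alpha_{\min}) \ge 0$ for the minimal value of this
function. This yields the following inequality for the bilinear form $\varphi$:

$$\varphi(\mathbf{v}, \mathbf{w})^2 \le \|\varphi\|^2\, |\mathbf{v}|^2\, |\mathbf{w}|^2.$$

Now it is easy to derive the required estimate for $|\varphi(\mathbf{v}, \mathbf{w})|$ by taking the
square root of both sides of the above inequality. Note that a quite similar method
was used when proving the Cauchy-Bunyakovsky-Schwarz inequality in the
theorem 1.1. □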

Theorem 2.2. Any quadratic form $\varphi$ in a finite-dimensional Euclidean vector
space $V$ is a restricted form.

Proof. Let's choose an orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and consider the
expansion of a unit vector $\mathbf{v}$ in this basis. For the coordinates of $\mathbf{v}$ in this basis
due to the formulas (1.12) we obtain

$$ (v^1)^2 + \ldots + (v^n)^2 = 1. $$

Hence, for the components of $\mathbf{v}$ we have $|v^i| \leq 1$. Let's express the quantity $\mu(\mathbf{v})$,
which is defined by formula (2.1), through the coordinates of $\mathbf{v}$:

$$ \mu(\mathbf{v}) = |\varphi(\mathbf{v})| = \left| \sum_{i=1}^n \sum_{j=1}^n \varphi_{ij} \, v^i \, v^j \right|. $$

From $|v^i| \leq 1$ we derive the following estimate for the quantity $\mu(\mathbf{v})$:

$$ \mu(\mathbf{v}) \leq \sum_{i=1}^n \sum_{j=1}^n |\varphi_{ij}| < \infty. \tag{2.6} $$

The right hand side of (2.6) does not depend on $\mathbf{v}$. Due to (2.2) this sum is an upper
bound for the norm $\|\varphi\|$. Hence, $\|\varphi\| < \infty$. The theorem is proved. □
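
As a numerical illustration of the estimate (2.6) and of the theorem 2.1, the following Python sketch approximates the norm (2.2) of a form in the plane by sampling the unit circle. The matrix `PHI`, the helper names, and the sampling resolution are assumptions made for this example only; sampling gives an approximation of the supremum from below, not its exact value.

```python
import math

def bilinear(phi, v, w):
    """Value phi(v, w) = sum_ij phi[i][j] * v[i] * w[j] of the symmetric bilinear form."""
    n = len(v)
    return sum(phi[i][j] * v[i] * w[j] for i in range(n) for j in range(n))

def sampled_norm(phi, steps=3600):
    """Approximate the supremum (2.2) for a form in the plane by sampling unit vectors."""
    best = 0.0
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        u = (math.cos(t), math.sin(t))
        best = max(best, abs(bilinear(phi, u, u)))
    return best

def entry_bound(phi):
    """Right hand side of (2.6): the sum of |phi_ij| over all i, j."""
    return sum(abs(x) for row in phi for x in row)

# Hypothetical matrix of a quadratic form in an orthonormal basis.
PHI = [[2.0, 1.0], [1.0, -3.0]]
norm_estimate = sampled_norm(PHI)
bound = entry_bound(PHI)

# Theorem 2.1 checked for one pair of vectors.
v, w = (1.0, 2.0), (3.0, -1.0)
lhs = abs(bilinear(PHI, v, w))
rhs = norm_estimate * math.hypot(*v) * math.hypot(*w)
```

For this particular matrix the sampled norm stays below the bound (2.6), and `lhs` does not exceed `rhs`, in agreement with the theorem 2.1.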

Theorem 2.3. For any quadratic form $\varphi$ in a finite-dimensional Euclidean vector
space $V$ the supremum in formula (2.2) is reached, i. e. there exists a vector
$\mathbf{v} \neq 0$ such that $|\varphi(\mathbf{v})| = \|\varphi\| \, |\mathbf{v}|^2$.

Proof. From the course of mathematical analysis we know that the supremum
of a numeric set is the limit of some converging sequence of numbers of this set
(see [6]). This means that there is a sequence of unit vectors $\mathbf{v}(1), \ldots, \mathbf{v}(s), \ldots$
in $V$ such that the norm $\|\varphi\|$ is expressed as the following limit:

$$ \|\varphi\| = \lim_{s \to \infty} |\varphi(\mathbf{v}(s))|. \tag{2.7} $$

Let's choose an orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$ and let's expand each vector
$\mathbf{v}(s)$ of the sequence in this basis. The equality

$$ (v^1(s))^2 + \ldots + (v^n(s))^2 = 1 \tag{2.8} $$

is derived from $|\mathbf{v}(s)| = 1$ due to the formulas (1.12). Now the equality (2.8) means
that each specific coordinate $v^i(s)$ yields a bounded sequence of real numbers:

$$ -1 \leq v^i(s) \leq 1. $$

From the course of mathematical analysis we know that in each bounded sequence
of real numbers one can choose a converging subsequence. So, in the sequence
of unit vectors $\mathbf{v}(s)$ one can choose a subsequence of unit vectors whose first
coordinates form a convergent sequence of numbers. Let's denote this subsequence
again by $\mathbf{v}(s)$ and choose its subsequence with converging second coordinates.
Repeating this choice $n$ times, once for each coordinate, we get a subsequence of
unit vectors $\mathbf{v}(s_k)$ such that all their coordinates are converging sequences of
numbers. Let's consider the limits of these sequences:

$$ v^i = \lim_{k \to \infty} v^i(s_k). \tag{2.9} $$

Denote by $\mathbf{v}$ the vector whose coordinates are determined by the limit values (2.9).
Passing to the limit $k \to \infty$ in (2.8) along the subsequence $\mathbf{v}(s_k)$, we conclude that $\mathbf{v}$ is a unit vector: $|\mathbf{v}| = 1$.

Now let's calculate $|\varphi(\mathbf{v})|$ using the matrix of the quadratic form $\varphi$ and the
coordinates of $\mathbf{v}$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$:

$$ |\varphi(\mathbf{v})| = \left| \sum_{i=1}^n \sum_{j=1}^n \varphi_{ij} \, v^i \, v^j \right| = \lim_{k \to \infty} \left| \sum_{i=1}^n \sum_{j=1}^n \varphi_{ij} \, v^i(s_k) \, v^j(s_k) \right|. $$

On the other hand, taking into account (2.7), for $|\varphi(\mathbf{v})|$ we get

$$ |\varphi(\mathbf{v})| = \lim_{k \to \infty} |\varphi(\mathbf{v}(s_k))| = \lim_{s \to \infty} |\varphi(\mathbf{v}(s))| = \|\varphi\|. \tag{2.10} $$

Thus, for the unit vector $\mathbf{v}$ with coordinates (2.9) we get $|\varphi(\mathbf{v})| = \|\varphi\|$. Multiplying
$\mathbf{v}$ by some nonzero number $\alpha \in \mathbb{R}$, we can remove the restriction $|\mathbf{v}| = 1$. Then the equality
(2.10) will be written as $|\varphi(\mathbf{v})| = \|\varphi\| \, |\mathbf{v}|^2$. The theorem is proved. □

Theorem 2.4. For any quadratic form $\varphi$ in a finite-dimensional Euclidean vector
space $(V, g)$ there is an orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ such that the matrix of
the form $\varphi$ in this basis is a diagonal matrix.

Proof. The proof is by induction on the dimension of the space $V$. In the case
$\dim V = 1$ the proposition of the theorem is obvious: any square $1 \times 1$ matrix is a
diagonal matrix.

Suppose that the proposition of the theorem is valid for all quadratic forms in
Euclidean spaces of the dimension less than $n$. Let $\dim V = n$ and let $\varphi$ be a
quadratic form in the Euclidean space $(V, g)$. Applying theorems 2.2 and 2.3, we
find a unit vector $\mathbf{v} \in V$ such that $|\varphi(\mathbf{v})| = \|\varphi\|$. For the sake of certainty we
assume that $\varphi(\mathbf{v}) \geq 0$. Then we can remove the modulus sign: $\varphi(\mathbf{v}) = \|\varphi\|$. In
the case $\varphi(\mathbf{v}) < 0$ we replace the form $\varphi$ by the opposite form $\tilde\varphi = -\varphi$ since two
opposite forms diagonalize simultaneously.

Let's denote $U = \langle \mathbf{v} \rangle$ and consider the orthogonal complement $U_\perp$. The
subspaces $U = \langle \mathbf{v} \rangle$ and $U_\perp$ have zero intersection, their sum is a direct sum and
$U \oplus U_\perp = V$ (see theorem 4.4 in Chapter IV). Let's take an arbitrary vector
$\mathbf{w} \in U_\perp$ of the unit length and compose the vector $\mathbf{u}$ as follows:

$$ \mathbf{u} = \cos(\alpha) \cdot \mathbf{v} + \sin(\alpha) \cdot \mathbf{w}. $$

Here $\alpha$ is a numeric parameter. It is easy to see that $\mathbf{u}$ is also a unit vector; this
follows from the identity $\cos^2(\alpha) + \sin^2(\alpha) = 1$ and from $(\mathbf{v} \,|\, \mathbf{w}) = 0$.
Let's calculate the value of the quadratic form $\varphi$ on the vector $\mathbf{u}$ and treat it as
a function of the numeric parameter $\alpha$:

$$ f(\alpha) = \varphi(\mathbf{u}) = \cos^2(\alpha) \, \varphi(\mathbf{v}) + 2 \sin(\alpha) \cos(\alpha) \, \varphi(\mathbf{v}, \mathbf{w}) + \sin^2(\alpha) \, \varphi(\mathbf{w}). $$

According to the choice of the vector $\mathbf{v}$, we have the estimate $\varphi(\mathbf{u}) \leq \varphi(\mathbf{v})$, and
for $\alpha = 0$, i. e. when $\mathbf{u} = \mathbf{v}$, we have the equality $\varphi(\mathbf{u}) = \varphi(\mathbf{v})$. Hence, $\alpha = 0$ is a
maximum point for the function $f(\alpha)$. Let's calculate its derivative at the point
$\alpha = 0$ and equate it to zero. This yields

$$ f'(0) = 2 \, \varphi(\mathbf{v}, \mathbf{w}) = 0. \tag{2.11} $$

Hence, $\varphi(\mathbf{v}, \mathbf{w}) = 0$ for all vectors $\mathbf{w} \in U_\perp$. Let's apply the inductive hypothesis
to the subspace $U_\perp$ whose dimension is less by 1 than the dimension of the space
$V$. Therefore, we can find an orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_{n-1}$ in the subspace
$U_\perp$ such that the matrix of the form $\varphi$ is diagonal in this basis: $\varphi(\mathbf{e}_i, \mathbf{e}_j) = 0$
for $i \neq j$. Let's complete the basis $\mathbf{e}_1, \ldots, \mathbf{e}_{n-1}$ with the vector $\mathbf{e}_n = \mathbf{v}$. The
complementary vector $\mathbf{e}_n$ is a vector of unit length. It is orthogonal to the vectors
$\mathbf{e}_1, \ldots, \mathbf{e}_{n-1}$. Therefore, the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is an orthonormal basis in $V$. The
matrix of the form $\varphi$ is diagonal in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. This fact is immediate
from (2.11). The theorem is proved. □
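
The key step of the proof, the vanishing of $\varphi(\mathbf{v}, \mathbf{w})$ at a stationary point of $f(\alpha)$, can be observed numerically in dimension 2. The Python sketch below is only an illustration: the matrix `PHI` is a hypothetical example, and the closed-form angle `theta` (from $\tan 2\theta = 2b/(a - c)$) is a standard formula that is not taken from the text.

```python
import math

def bilinear(phi, v, w):
    """Value phi(v, w) of the symmetric bilinear form with matrix phi."""
    return sum(phi[i][j] * v[i] * w[j] for i in range(2) for j in range(2))

def extremal_basis(phi):
    """Orthonormal pair (v, w) where v is a stationary point of phi on the unit circle."""
    a, b, c = phi[0][0], phi[0][1], phi[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    v = (math.cos(theta), math.sin(theta))
    w = (-math.sin(theta), math.cos(theta))
    return v, w

# Hypothetical matrix of a quadratic form in an orthonormal basis.
PHI = [[2.0, 1.0], [1.0, -3.0]]
v, w = extremal_basis(PHI)
diagonal = (bilinear(PHI, v, v), bilinear(PHI, w, w))
off_diagonal = bilinear(PHI, v, w)  # vanishes up to rounding, as in (2.11)
```

In the basis `v`, `w` the matrix of the form is diagonal with the entries `diagonal`; in this example the norm of the form is attained on `w`, where the form is negative.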

The theorem 2.4 is known as the theorem on simultaneous diagonalization of
a pair of quadratic forms $\varphi$ and $g$. For this purpose one of them should be positive.
Then the positive form $g$ defines the structure of a Euclidean space in $V$ and then
one can apply the theorem 2.4. Orthonormality of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ means
that the matrix of $g$ is diagonal in this basis (it is the unit matrix). The matrix of
$\varphi$ is also diagonal as stated in the theorem 2.4.

§ 3. Selfadjoint operators. The theorem on the spectrum 
and the basis of eigenvectors for a selfadjoint operator. 

Definition 3.1. A linear operator $f: V \to V$ in a Euclidean vector space
$V$ is called a symmetric operator or a selfadjoint operator if for any two vectors
$\mathbf{v}, \mathbf{w} \in V$ the following equality is fulfilled: $(\mathbf{v} \,|\, f(\mathbf{w})) = (f(\mathbf{v}) \,|\, \mathbf{w})$.

Definition 3.2. A linear operator $h: V \to V$ in a Euclidean vector space $V$
is called an adjoint operator to the operator $f: V \to V$ if for any two vectors
$\mathbf{v}, \mathbf{w} \in V$ the following equality is fulfilled: $(\mathbf{v} \,|\, f(\mathbf{w})) = (h(\mathbf{v}) \,|\, \mathbf{w})$. The adjoint
operator is denoted as follows: $h = f^+$.

In § 4 of Chapter III we have introduced the concept of conjugate mapping.
There we have shown that any linear mapping $f: V \to W$ possesses the conjugate
mapping $f^*: W^* \to V^*$. For a linear operator $f: V \to V$ the conjugate mapping
$f^*$ is a linear operator in the dual space $V^*$. It is related to $f$ by means of the equality

$$ \langle f^*(u) \,|\, \mathbf{v} \rangle = \langle u \,|\, f(\mathbf{v}) \rangle, \tag{3.1} $$

which is fulfilled for all $u \in V^*$ and for all $\mathbf{v} \in V$.
The structure of a Euclidean vector space in $V$ is determined by a positive
quadratic form $g$. Like every quadratic form, the form $g$ possesses the associated
mapping $a_g: V \to V^*$ (see § 2 in Chapter III) such that

$$ \langle a_g(\mathbf{v}) \,|\, \mathbf{w} \rangle = g(\mathbf{v}, \mathbf{w}) = (\mathbf{v} \,|\, \mathbf{w}). \tag{3.2} $$

In the case of a finite-dimensional space $V$ and a positive form $g$ the associated
mapping $a_g$ is bijective. Therefore, for any linear operator $f: V \to V$ we can
define the composition $h = a_g^{-1} \circ f^* \circ a_g$. Then from (3.1) and (3.2) we derive

$$ (h(\mathbf{v}) \,|\, \mathbf{w}) = \langle a_g \circ h(\mathbf{v}) \,|\, \mathbf{w} \rangle = \langle f^* \circ a_g(\mathbf{v}) \,|\, \mathbf{w} \rangle = \langle a_g(\mathbf{v}) \,|\, f(\mathbf{w}) \rangle = (\mathbf{v} \,|\, f(\mathbf{w})). \tag{3.3} $$

Comparing (3.3) with the definition 3.2, we can formulate the following theorem.

Theorem 3.1. For any operator $f$ in a finite-dimensional Euclidean space $(V, g)$
there is the unique adjoint operator $f^+ = a_g^{-1} \circ f^* \circ a_g$.

Proof. The existence of an adjoint operator is already derived from the
formula $f^+ = a_g^{-1} \circ f^* \circ a_g$ and the equality (3.3). Let's prove its uniqueness.
Assume that $h$ is another operator satisfying the definition 3.2. Then for the
difference $r = h - f^+$ we derive the relationship

$$ (r(\mathbf{v}) \,|\, \mathbf{w}) = (h(\mathbf{v}) \,|\, \mathbf{w}) - (f^+(\mathbf{v}) \,|\, \mathbf{w}) = (\mathbf{v} \,|\, f(\mathbf{w})) - (\mathbf{v} \,|\, f(\mathbf{w})) = 0. \tag{3.4} $$

Since $\mathbf{w}$ in (3.4) is an arbitrary vector, we conclude that $r(\mathbf{v}) \in \operatorname{Ker} g$. However,
$\operatorname{Ker} g = \{0\}$ for a positive quadratic form $g$, hence, $r(\mathbf{v}) = 0$ for any $\mathbf{v} \in V$. This
means that $r = 0$, i. e. $h = f^+$. Thus, we have proved that the adjoint operator $f^+$ for $f$ is
unique. This completes the proof of the theorem. □

Corollary. The passage from $f$ to $f^+$ is an operator in the space of endomorphisms
$\operatorname{End}(V)$ of a finite-dimensional Euclidean vector space $(V, g)$. This operator
possesses the following properties:

$$ (f + h)^+ = f^+ + h^+, \qquad (\alpha \cdot f)^+ = \alpha \cdot f^+, $$
$$ (f \circ h)^+ = h^+ \circ f^+, \qquad (f^+)^+ = f. $$

Relying upon the existence and the uniqueness of the adjoint operator $f^+$ for
any operator $f \in \operatorname{End}(V)$, we can derive all the above relationships immediately
from the definition 3.2. The relationship $f^+ = a_g^{-1} \circ f^* \circ a_g$ can be expressed in
the form of the following commutative diagram:

$$ \begin{array}{ccc} V & \xrightarrow{\ f^+\ } & V \\ {\scriptstyle a_g} \downarrow & & \downarrow {\scriptstyle a_g} \\ V^* & \xrightarrow{\ f^*\ } & V^* \end{array} $$

Comparing the definitions 3.1 and 3.2, now we see that a selfadjoint operator $f$ is
an operator which is adjoint to itself: $f^+ = f$.

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis in a finite-dimensional Euclidean space $(V, g)$ and let
$h^1, \ldots, h^n$ be the corresponding dual basis composed by coordinate functionals.
For any vector $\mathbf{v} \in V$ we have the following expansion, which follows from the
definition of coordinate functionals (see § 1 in Chapter III):

$$ \mathbf{v} = h^1(\mathbf{v}) \cdot \mathbf{e}_1 + \ldots + h^n(\mathbf{v}) \cdot \mathbf{e}_n. $$

Let's apply this expansion in order to calculate the matrix of the associated
mapping $a_g$. For this purpose we need to apply $a_g$ one by one to all basis vectors
$\mathbf{e}_1, \ldots, \mathbf{e}_n$ and expand the results in the dual basis in $V^*$. Let's consider the
value of the functional $a_g(\mathbf{e}_i)$ on an arbitrary vector $\mathbf{v}$ of the space $V$:

$$ a_g(\mathbf{e}_i)(\mathbf{v}) = \langle a_g(\mathbf{e}_i) \,|\, \mathbf{v} \rangle = g(\mathbf{e}_i, \mathbf{v}) = g(\mathbf{e}_i, \, h^1(\mathbf{v}) \cdot \mathbf{e}_1 + \ldots + h^n(\mathbf{v}) \cdot \mathbf{e}_n) = \sum_{j=1}^n g_{ij} \, h^j(\mathbf{v}). $$

Since $\mathbf{v} \in V$ is an arbitrary vector, we conclude that the matrix of the associated
mapping $a_g$ in two bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $h^1, \ldots, h^n$ coincides with the matrix
$g_{ij} = g(\mathbf{e}_i, \mathbf{e}_j)$ of the quadratic form $g$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. The matrix $g_{ij}$ is
non-degenerate (see theorem 1.2 or Sylvester's criterion in § 4 of Chapter IV). Let's
denote by $g^{ij}$ the components of the matrix inverse to $g_{ij}$. The matrix $g^{ij}$ is the
matrix of the inverse mapping $a_g^{-1}$, i. e. we have:

$$ a_g(\mathbf{e}_i) = \sum_{j=1}^n g_{ij} \, h^j, \qquad a_g^{-1}(h^i) = \sum_{j=1}^n g^{ij} \, \mathbf{e}_j. \tag{3.5} $$

The matrix inverse to a symmetric matrix is again a symmetric matrix (this fact
is well-known from general algebra). Therefore $g^{ij} = g^{ji}$.

Remember that we have already calculated the matrix of the conjugate mapping
$f^*$ (see formula (4.2) and theorem 4.3 in Chapter III). When applied to our present
case the results of Chapter III mean that the matrix of the operator $f^*: V^* \to V^*$
in the basis of coordinate functionals $h^1, \ldots, h^n$ coincides with the matrix of the
initial operator $f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Let's combine this fact with (3.5) and
let's use the formula $f^+ = a_g^{-1} \circ f^* \circ a_g$ from the theorem 3.1. Then for the matrix
$F^+$ of the adjoint operator $f^+$ we obtain:

$$ (F^+)^i_j = \sum_{k=1}^n \sum_{q=1}^n g^{iq} \, F^k_q \, g_{kj}. \tag{3.6} $$

In matrix form the formula (3.6) is written as $F^+ = G^{-1} F^{\mathrm{tr}} \, G$, where $G$ is the
Gram matrix of that basis in which the matrices of $f$ and $f^+$ are calculated.
The formula (3.6) simplifies substantially for orthonormal bases. Here the passage
to the adjoint operator means the transposition of its matrix. The matrix of
a selfadjoint operator in an orthonormal basis is symmetric. For this reason
selfadjoint operators are often called symmetric operators.
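
The matrix formula $F^+ = G^{-1} F^{\mathrm{tr}} \, G$ can be checked on a small example. The sketch below uses exact rational arithmetic; the Gram matrix `G` of a non-orthonormal basis, the operator matrix `F`, and all helper names are assumptions chosen for illustration.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse2(a):
    """Inverse of a non-degenerate 2x2 matrix."""
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def adjoint_matrix(f, g):
    """Matrix F+ = G^{-1} F^tr G of the adjoint operator, as in (3.6)."""
    return matmul(matmul(inverse2(g), transpose(f)), g)

def apply(f, x):
    return [sum(f[i][j] * x[j] for j in range(2)) for i in range(2)]

def scalar(g, x, y):
    """Scalar product (x | y) of vectors given by coordinates in a basis with Gram matrix g."""
    return sum(g[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

G = [[Fr(2), Fr(1)], [Fr(1), Fr(1)]]   # hypothetical Gram matrix (positive definite)
F = [[Fr(1), Fr(2)], [Fr(3), Fr(4)]]   # hypothetical operator matrix
F_PLUS = adjoint_matrix(F, G)

v, w = [Fr(1), Fr(0)], [Fr(0), Fr(1)]
left = scalar(G, v, apply(F, w))        # (v | f(w))
right = scalar(G, apply(F_PLUS, v), w)  # (f+(v) | w)
```

Here `left` and `right` coincide, as the definition 3.2 requires. With the unit matrix in place of `G`, the function `adjoint_matrix` reduces to plain transposition.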
Let $f: V \to V$ be a selfadjoint operator in a Euclidean space $V$. Each such
operator produces the quadratic form $\varphi_f$ according to the formula

$$ \varphi_f(\mathbf{v}) = (\mathbf{v} \,|\, f(\mathbf{v})). \tag{3.7} $$

Conversely, assume that we have a quadratic form $\varphi$ in a finite-dimensional
Euclidean space $(V, g)$. The form $\varphi$ determines the associated mapping $a_\varphi$ (see
definition 2.5 in Chapter IV). This mapping satisfies the relationship

$$ \langle a_\varphi(\mathbf{v}) \,|\, \mathbf{w} \rangle = \varphi(\mathbf{v}, \mathbf{w}) \tag{3.8} $$

for any two vectors $\mathbf{v}, \mathbf{w} \in V$. The positive quadratic form $g$ defining the structure
of Euclidean space in $V$ has also its own associated mapping $a_g$. The mapping $a_g$
is bijective since $g$ is non-degenerate (see theorem 4.2 in Chapter IV). Therefore,
we can consider the composition of $a_g^{-1}$ and $a_\varphi$:

$$ f_\varphi = a_g^{-1} \circ a_\varphi. \tag{3.9} $$

This composition (3.9) is an operator in $V$. It is called the associated operator
of the form $\varphi$ in a Euclidean space. Since $a_g$ is bijective, we can write (3.2) as
$\langle u \,|\, \mathbf{w} \rangle = (a_g^{-1}(u) \,|\, \mathbf{w})$ for $u \in V^*$. Combining this equality with (3.8), we find

$$ (f_\varphi(\mathbf{v}) \,|\, \mathbf{w}) = (a_g^{-1}(a_\varphi(\mathbf{v})) \,|\, \mathbf{w}) = \langle a_\varphi(\mathbf{v}) \,|\, \mathbf{w} \rangle = \varphi(\mathbf{v}, \mathbf{w}). \tag{3.10} $$

Now, using the symmetry of the form $\varphi(\mathbf{v}, \mathbf{w})$ in (3.10), we write

$$ (f_\varphi(\mathbf{v}) \,|\, \mathbf{w}) = \varphi(\mathbf{v}, \mathbf{w}) = \varphi(\mathbf{w}, \mathbf{v}) = (f_\varphi(\mathbf{w}) \,|\, \mathbf{v}) = (\mathbf{v} \,|\, f_\varphi(\mathbf{w})). \tag{3.11} $$

The relationship (3.11), which is an identity for all $\mathbf{v}, \mathbf{w} \in V$, means that $f_\varphi$ is a
selfadjoint operator (see definition 3.1).

The formula (3.7) associates each selfadjoint operator $f$ with the quadratic form
$\varphi_f$, while the formula (3.9) associates each quadratic form $\varphi$ with the selfadjoint
operator $f_\varphi$. These two associations are one to one and are inverse to each other.
Indeed, let's apply the formula (3.7) to the operator $f = f_\varphi$ from (3.9) and use (3.10):

$$ \varphi_f(\mathbf{v}) = (\mathbf{v} \,|\, f_\varphi(\mathbf{v})) = \varphi(\mathbf{v}, \mathbf{v}) = \varphi(\mathbf{v}). $$

Now, conversely, let's construct the operator $h = f_\varphi$ for the quadratic form $\varphi = \varphi_f$.
For the operator $h$ and for two arbitrary vectors $\mathbf{v}, \mathbf{w} \in V$ from (3.10) we derive

$$ (h(\mathbf{v}) \,|\, \mathbf{w}) = \varphi_f(\mathbf{v}, \mathbf{w}) = (\mathbf{v} \,|\, f(\mathbf{w})) = (f(\mathbf{v}) \,|\, \mathbf{w}). $$

Since $\mathbf{w} \in V$ is an arbitrary vector and since the form $g$ determining the scalar
product in $V$ is non-degenerate, from the above equality we get $h(\mathbf{v}) = f(\mathbf{v})$.

Thus, from what was said above we conclude that defining a selfadjoint operator
in a finite-dimensional Euclidean space is equivalent to defining a quadratic form
in this space. Therefore, we can apply the theorem 2.4 for describing selfadjoint
operators in a finite-dimensional case.
Theorem 3.2. All eigenvalues of a selfadjoint operator $f$ in a finite-dimensional
Euclidean space $V$ are real numbers and there is an orthonormal basis composed
by eigenvectors of such operator.

Proof. For the selfadjoint operator $f$ in $V$ we consider the symmetric bilinear
form $\varphi_f(\mathbf{v}, \mathbf{w})$ determined by the quadratic form (3.7). Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be an
orthonormal basis in which the matrix of the form $\varphi_f$ is diagonal. Then from the
formula (3.7) we derive the following equalities:

$$ \varphi_f(\mathbf{e}_i, \mathbf{e}_j) = (\mathbf{e}_i \,|\, f(\mathbf{e}_j)) = \sum_{k=1}^n F^k_j \, g_{ik} = F^i_j. \tag{3.12} $$

As we see in (3.12), the matrices of the operator $f$ and of the form $\varphi_f$ in such
a basis coincide. This proves the proposition of the theorem. □

The theorem 3.2 is known as the theorem on the spectrum and the basis of
eigenvectors of a selfadjoint operator. The main result of this theorem is the
diagonalizability of selfadjoint operators in a finite-dimensional Euclidean space.
The characteristic polynomial of a selfadjoint operator is factorized into the
product of linear terms in $\mathbb{R}$. Its eigenspaces coincide with the corresponding root
subspaces, the sum of all its eigenspaces coincides with the space $V$:

$$ V = V_{\lambda_1} \oplus \ldots \oplus V_{\lambda_s}. \tag{3.13} $$

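Theorem 3.3. Any two eigenvectors of a selfadjoint operator corresponding to
different eigenvalues are orthogonal to each other.

Proof. Let $f$ be a selfadjoint operator in a Euclidean space and let $\lambda \neq \mu$ be
its eigenvalues. Let's consider the corresponding eigenvectors $\mathbf{a}$ and $\mathbf{b}$:

$$ f(\mathbf{a}) = \lambda \cdot \mathbf{a}, \qquad f(\mathbf{b}) = \mu \cdot \mathbf{b}. $$

Then for these two eigenvectors $\mathbf{a}$ and $\mathbf{b}$ we derive:

$$ \lambda \, (\mathbf{a} \,|\, \mathbf{b}) = (f(\mathbf{a}) \,|\, \mathbf{b}) = (\mathbf{a} \,|\, f(\mathbf{b})) = \mu \, (\mathbf{a} \,|\, \mathbf{b}). $$

Hence, $(\lambda - \mu) \, (\mathbf{a} \,|\, \mathbf{b}) = 0$. But we know that $\lambda - \mu \neq 0$. Therefore, $(\mathbf{a} \,|\, \mathbf{b}) = 0$.
The theorem is proved. □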

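Assume that the kernel of a selfadjoint operator $f$ is nontrivial: $\operatorname{Ker} f \neq \{0\}$.
Then $\lambda_1 = 0$ in (3.13) is one of the eigenvalues of the operator $f$ and we have

$$ \operatorname{Ker} f = V_{\lambda_1}, \qquad \operatorname{Im} f = V_{\lambda_2} \oplus \ldots \oplus V_{\lambda_s}. $$

This means that the kernel and the image of a selfadjoint operator are orthogonal
to each other and their sum coincides with $V$:

$$ V = \operatorname{Ker} f \oplus \operatorname{Im} f. \tag{3.14} $$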
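
A small example may help to see the theorem 3.3 and the decomposition (3.14) at work. In the Python sketch below the symmetric matrix `F` is a hypothetical operator matrix in an orthonormal basis of the plane; it is degenerate, so its kernel is nontrivial.

```python
def apply(f, x):
    """Coordinates of f(x) for an operator with matrix f."""
    return tuple(sum(f[i][j] * x[j] for j in range(len(x))) for i in range(len(f)))

def dot(x, y):
    """Scalar product in coordinates relative to an orthonormal basis."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical selfadjoint operator: its matrix in an orthonormal basis is symmetric.
F = [[1, 2], [2, 4]]

a = (2, -1)   # eigenvector with the eigenvalue 0, it spans Ker f
b = (1, 2)    # eigenvector with the eigenvalue 5

# The image of f is spanned by the images of the basis vectors.
image = [apply(F, e) for e in ((1, 0), (0, 1))]
kernel_is_orthogonal_to_image = all(dot(a, y) == 0 for y in image)
```

The eigenvectors `a` and `b` belong to different eigenvalues and are orthogonal, and every vector of the image is orthogonal to the kernel vector `a`.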
§ 4. Isometries and orthogonal operators.

Definition 4.1. A linear mapping $f: V \to W$ from one Euclidean vector
space $(V, g)$ to another Euclidean vector space $(W, h)$ is called an isometry if

$$ (f(\mathbf{x}) \,|\, f(\mathbf{y})) = (\mathbf{x} \,|\, \mathbf{y}) \tag{4.1} $$

for all $\mathbf{x}, \mathbf{y} \in V$, i. e. if it preserves the scalar product of vectors.

From (4.1) we easily derive $|f(\mathbf{x})| = |\mathbf{x}|$, therefore, $f(\mathbf{x}) = 0$ implies $|\mathbf{x}| = 0$ and
$\mathbf{x} = 0$. This means that the kernel of an isometry is always trivial: $\operatorname{Ker} f = \{0\}$, i. e.
any isometry is an injective mapping. Due to the recovery formula for quadratic
forms (see formula (1.6) in Chapter IV) in order to verify that $f: V \to W$ is
an isometry it is sufficient to verify that it preserves the norm of vectors, i. e.
$|f(\mathbf{x})| = |\mathbf{x}|$ for all vectors $\mathbf{x} \in V$.

Theorem 4.1. The composition of isometries is again an isometry. 

Proof. Assume that the mappings $h: U \to V$ and $f: V \to W$ both are
isometries. Hence, $|h(\mathbf{u})| = |\mathbf{u}|$ for all $\mathbf{u} \in U$ and $|f(\mathbf{v})| = |\mathbf{v}|$ for all $\mathbf{v} \in V$. Then

$$ |f \circ h(\mathbf{u})| = |f(h(\mathbf{u}))| = |h(\mathbf{u})| = |\mathbf{u}| $$

for all $\mathbf{u} \in U$. This equality means that the mapping $f \circ h$ is an isometry. The
theorem is proved. □

Definition 4.2. A bijective isometry $f: V \to W$ is called an isomorphism of
Euclidean vector spaces.

Theorem 4.2. Isomorphisms of Euclidean vector spaces possess the following
three properties:

(1) the identical mapping $\operatorname{id}_V$ is an isomorphism;

(2) the composition of isomorphisms is an isomorphism;

(3) the mapping inverse to an isomorphism is an isomorphism.

The proof of this theorem is very easy if we use the above theorem 4.1 and the
theorem 8.1 of Chapter I.

Definition 4.3. Two Euclidean vector spaces $V$ and $W$ are called isomorphic
if there is an isomorphism $f: V \to W$ relating them.

Let's consider the arithmetic vector space $\mathbb{R}^n$ composed by column vectors of
the height $n$. The addition of such vectors and the multiplication of them by
real numbers are performed as the operations with their components (see formulas
(2.1) in Chapter I). Let's define a quadratic form $g(\mathbf{x})$ in $\mathbb{R}^n$ by setting

$$ g(\mathbf{x}) = (x^1)^2 + \ldots + (x^n)^2 = \sum_{i=1}^n (x^i)^2. \tag{4.2} $$

The form (4.2) yields the standard scalar product and, hence, defines the standard
structure of a Euclidean space in $\mathbb{R}^n$.
Theorem 4.3. Any $n$-dimensional Euclidean vector space $V$ is isomorphic to
the space $\mathbb{R}^n$ with the standard scalar product (4.2).

In order to prove this theorem it is sufficient to choose an orthonormal basis in
$V$ and consider the mapping $\psi$ that associates a vector $\mathbf{v} \in V$ with the column vector
of its coordinates (see formula (5.4) in Chapter I).

Definition 4.4. An operator $f$ in a Euclidean vector space $V$ is called an
orthogonal operator if it is bijective and defines an isometry $f: V \to V$.

Due to the theorem 4.2 the orthogonal operators form a group which is called
the orthogonal group of a Euclidean space $V$ and is denoted by $\mathrm{O}(V)$. The group
$\mathrm{O}(V)$ is obviously a subgroup in the group of automorphisms $\operatorname{Aut}(V)$. In the case
$V = \mathbb{R}^n$ the orthogonal group determined by the standard scalar product in $\mathbb{R}^n$ is
denoted by $\mathrm{O}(n, \mathbb{R})$.

Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be an orthonormal basis in a Euclidean space $V$ and let $f$ be an
orthogonal operator. Then from (4.1) we derive

$$ (f(\mathbf{e}_i) \,|\, f(\mathbf{e}_j)) = (\mathbf{e}_i \,|\, \mathbf{e}_j). $$

For the matrix of the operator $f$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ this relationship yields:

$$ \sum_{k=1}^n F^k_i \, F^k_j = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases} \tag{4.3} $$

When written in the matrix form, the formula (4.3) means that

$$ F^{\mathrm{tr}} F = 1, \qquad F^{-1} = F^{\mathrm{tr}}. \tag{4.4} $$

The relationships (4.4) are identical to the relationships (1.13). Matrices that
satisfy such relationships, as we already know, are called orthogonal matrices. As
a corollary of this fact we can formulate the following theorem.

Theorem 4.4. An orthogonal operator $f$ in an orthonormal basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$
of a Euclidean vector space $V$ is given by an orthogonal matrix.

As we have noted in § 1, the determinant of an orthogonal matrix can be equal
to $1$ or to $-1$. The orthogonal operators in $V$ with determinant 1 form a group
which is called the special orthogonal group of a Euclidean vector space $V$. This
group is denoted by $\mathrm{SO}(V)$. If $V = \mathbb{R}^n$, this group is denoted by $\mathrm{SO}(n, \mathbb{R})$.

The operators $f \in \mathrm{SO}(V)$ in the two-dimensional case $\dim V = 2$ are the simplest
ones. If $\mathbf{e}_1, \mathbf{e}_2$ is an orthonormal basis in $V$, then from (4.3) and $\det F = 1$ we
easily find the form of an orthogonal matrix $F$:

$$ F = \begin{pmatrix} \cos(\varphi) & -\sin(\varphi) \\ \sin(\varphi) & \cos(\varphi) \end{pmatrix}. \tag{4.5} $$

A matrix $F$ of the form (4.5) is called a matrix of two-dimensional rotation, while
the numeric parameter $\varphi$ is interpreted as the angle of rotation.
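
The relationships (4.4) and the group property of rotations are easy to check numerically for matrices of the form (4.5). The following Python sketch does this in floating point arithmetic; the angles 0.7 and 0.4 and the helper names are arbitrary choices made for this illustration.

```python
import math

def rotation(phi):
    """Matrix (4.5) of the two-dimensional rotation by the angle phi."""
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi), math.cos(phi)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def close(a, b, eps=1e-12):
    """Entrywise comparison of two 2x2 matrices up to rounding errors."""
    return all(abs(a[i][j] - b[i][j]) < eps for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
F = rotation(0.7)
H = rotation(0.4)

is_orthogonal = close(matmul(transpose(F), F), IDENTITY)   # F^tr F = 1
angles_add = close(matmul(F, H), rotation(0.7 + 0.4))      # composition of rotations
```

The second check reflects the fact that the composition of rotations by the angles 0.7 and 0.4 is the rotation by their sum.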
Let's consider orthogonal operators $f \in \mathrm{SO}(V)$ in the case $\dim V = 3$. Let
$\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ be an orthonormal basis in $V$. A matrix of the form

$$ F = \begin{pmatrix} \cos(\varphi) & -\sin(\varphi) & 0 \\ \sin(\varphi) & \cos(\varphi) & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{4.6} $$

is an orthogonal matrix with determinant 1. The operator $f$ associated with the
matrix (4.6) is called the operator of rotation about the vector $\mathbf{e}_3$ by the angle $\varphi$.

Theorem 4.5. In a three-dimensional Euclidean vector space $V$ any orthogonal
operator $f$ with determinant 1 has an eigenvalue $\lambda = 1$.

Proof. Let's consider the characteristic polynomial of the operator $f$. This is
the polynomial of degree 3 in $\lambda$ with real coefficients:

$$ P(\lambda) = -\lambda^3 + F_1 \lambda^2 - F_2 \lambda + F_3, \quad \text{where } F_3 = \det f = 1. $$

Remember that the values of a polynomial of odd degree for large positive $\lambda$ and
for large negative $\lambda$ differ in sign:

$$ \lim_{\lambda \to -\infty} P(\lambda) = +\infty, \qquad \lim_{\lambda \to +\infty} P(\lambda) = -\infty. $$

Therefore the equation of the odd degree $P(\lambda) = 0$ with real coefficients has at
least one real root $\lambda = \lambda_1$. This root is an eigenvalue of the operator $f$.

Let $\mathbf{e}_1 \neq 0$ be an eigenvector of $f$ corresponding to the eigenvalue $\lambda_1$. Then,
applying the isometry condition $|\mathbf{v}| = |f(\mathbf{v})|$ to the vector $\mathbf{v} = \mathbf{e}_1$, we get

$$ |\mathbf{e}_1| = |f(\mathbf{e}_1)| = |\lambda_1 \cdot \mathbf{e}_1| = |\lambda_1| \, |\mathbf{e}_1|. $$

Hence, we find that $|\lambda_1| = 1$. This means that $\lambda_1 = 1$ or $\lambda_1 = -1$. In the case
$\lambda_1 = 1$ the proposition of the theorem is valid. Therefore, we consider the case
$\lambda_1 = -1$. Let's separate the linear factor $(\lambda + 1)$ in the characteristic polynomial:

$$ P(\lambda) = -\lambda^3 + F_1 \lambda^2 - F_2 \lambda + 1 = -(\lambda + 1)(\lambda^2 - \Phi_1 \lambda - 1). $$

Then $F_1 = \Phi_1 - 1$ and $F_2 = -1 - \Phi_1$. In order to find the remaining roots of the
polynomial $P(\lambda)$ we consider the following quadratic equation:

$$ \lambda^2 - \Phi_1 \lambda - 1 = 0. $$

This equation always has two real roots $\lambda_2$ and $\lambda_3$ since its discriminant is positive:
$D = (\Phi_1)^2 + 4 > 0$. Due to the Vieta theorem we have $\lambda_2 \lambda_3 = -1$. Due to the same
reasons as above in the case of $\lambda_1$, for $\lambda_2$ and $\lambda_3$ we get $|\lambda_2| = |\lambda_3| = 1$. Hence,
one of these two real numbers is equal to $1$ and the other is equal to $-1$. Thus,
we have proved that the number $\lambda = 1$ is among the eigenvalues of the operator $f$.
The theorem is proved. □
Theorem 4.6. In a three-dimensional Euclidean vector space $V$ for any orthogonal
operator $f$ with determinant 1 there is an orthonormal basis in which the
matrix of $f$ has the form (4.6).

Proof. By theorem 4.5 the operator $f$ has an eigenvalue
$\lambda_1 = 1$. Let $\mathbf{e}_1 \neq 0$ be an eigenvector of this operator associated with the
eigenvalue $\lambda_1 = 1$. Let's denote by $U$ the span of the eigenvector $\mathbf{e}_1$ and consider
its orthogonal complement $U_\perp$. This is a two-dimensional subspace in the
three-dimensional space $V$. This subspace is invariant under the action of $f$. Indeed,
from $\mathbf{x} \in U_\perp$ we derive $(\mathbf{x} \,|\, \mathbf{e}_1) = 0$. Let's write the isometry condition (4.1) for
the vectors $\mathbf{x}$ and $\mathbf{y} = \mathbf{e}_1$:

$$ 0 = (\mathbf{x} \,|\, \mathbf{e}_1) = (f(\mathbf{x}) \,|\, f(\mathbf{e}_1)) = \lambda_1 \, (f(\mathbf{x}) \,|\, \mathbf{e}_1). $$

Since $\lambda_1 = 1$, we get $(f(\mathbf{x}) \,|\, \mathbf{e}_1) = 0$. Hence, $f(\mathbf{x}) \in U_\perp$, which proves the
invariance of the subspace $U_\perp$.

Let's consider the restriction of the operator $f$ to the invariant subspace
$U_\perp$. This restriction is an orthogonal operator in the two-dimensional space $U_\perp$, its
determinant being equal to 1. Therefore, in some orthonormal basis $\mathbf{e}_2, \mathbf{e}_3$ of $U_\perp$
the matrix of the restricted operator has the form (4.5).

Remember that $\mathbf{e}_1$ is perpendicular to $\mathbf{e}_2$ and $\mathbf{e}_3$. It can be normalized to the
unit length. Then the three vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ form an orthonormal basis in the
three-dimensional space $V$. Taking them in the order $\mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_1$, so that the eigenvector
comes last, we find that the matrix of $f$ in this basis has the form (4.6). The
theorem is proved. □

The result of this theorem is that any orthogonal operator $f$ with determinant
1 in a three-dimensional Euclidean vector space $V$ is an operator of rotation.
The eigenvector $\mathbf{e}_1$ associated with the eigenvalue $\lambda_1 = 1$ determines the axis of
rotation, while the real parameter $\varphi$ in the matrix (4.6) determines the angle of
such rotation.
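
In practice the axis and the angle can be read off from the matrix of $f$ in any orthonormal basis. The Python sketch below is an illustration under stated assumptions: it uses the fact that the trace of a matrix does not depend on the basis, so by (4.6) it equals $1 + 2\cos(\varphi)$, and it takes the axis from the skew-symmetric part of the matrix, which is a standard device not taken from the text and which fails when the angle is $0$ or $\pi$. The matrix `F` is a hypothetical example.

```python
import math

def rotation_axis_angle(f):
    """Unit axis vector and angle in [0, pi] of a rotation given by a 3x3 matrix."""
    trace = f[0][0] + f[1][1] + f[2][2]
    # By (4.6) the trace equals 1 + 2 cos(phi); clamp against rounding errors.
    cos_phi = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    phi = math.acos(cos_phi)
    # The skew-symmetric part of f is proportional to sin(phi) times the axis.
    axis = (f[2][1] - f[1][2], f[0][2] - f[2][0], f[1][0] - f[0][1])
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("axis is not determined by the skew part (angle 0 or pi)")
    return tuple(c / length for c in axis), phi

def apply(f, x):
    return tuple(sum(f[i][j] * x[j] for j in range(3)) for i in range(3))

# Hypothetical example: the cyclic permutation of the basis vectors e1 -> e2 -> e3 -> e1.
F = [[0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
axis, phi = rotation_axis_angle(F)
```

For this matrix the axis is directed along $\mathbf{e}_1 + \mathbf{e}_2 + \mathbf{e}_3$ and the angle equals $2\pi/3$; the axis vector is an eigenvector of `F` with the eigenvalue 1.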
CHAPTER VI
AFFINE SPACES. 



§ 1. Points and parallel translations. Affine spaces. 

Let $M$ be an arbitrary set. A transformation of the set $M$ is a bijective
mapping $p: M \to M$ of the set $M$ onto itself.

Definition 1.1. Let $V$ be a linear vector space. We say that an action of $V$
on a set $M$ is defined if each vector $\mathbf{v} \in V$ is associated with some transformation
$p_{\mathbf{v}}$ of the set $M$ and the following conditions are fulfilled:

(1) $p_{\mathbf{0}} = \operatorname{id}_M$;

(2) $p_{\mathbf{v}+\mathbf{w}} = p_{\mathbf{v}} \circ p_{\mathbf{w}}$ for all $\mathbf{v}, \mathbf{w} \in V$.

From the properties (1) and (2) of an action of a space $V$ on a set $M$ one can
easily derive the following two properties of such action:

(3) $p_{-\mathbf{v}} = p_{\mathbf{v}}^{-1}$ for all $\mathbf{v} \in V$;

(4) $p_{\mathbf{v}} \circ p_{\mathbf{w}} = p_{\mathbf{w}} \circ p_{\mathbf{v}}$ for all $\mathbf{v}, \mathbf{w} \in V$.

Definition 1.2. An action of a vector space $V$ on a set $M$ is called a transitive
action if for any two elements $A, B \in M$ there is a vector $\mathbf{v} \in V$ such that
$p_{\mathbf{v}}(A) = B$, i. e. the transformation $p_{\mathbf{v}}$ takes $A$ to $B$.

Definition 1.3. An action of a vector space $V$ on a set $M$ is called a free
action if for any element $A \in M$ the equality $p_{\mathbf{v}}(A) = A$ implies $\mathbf{v} = 0$.

Definition 1.4. A set $M$ is called an affine space over the field $\mathbb{K}$ if there is a
free transitive action of some linear vector space $V$ over the field $\mathbb{K}$ on $M$.

Due to this definition any affine space $M$ is associated with some linear vector
space $V$. Therefore an affine space $M$ is often denoted as a pair $(M, V)$.
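
The definitions 1.1 to 1.4 can be tried out on the simplest model: the plane of points with rational coordinates, on which vectors act by coordinatewise addition. The Python sketch below is only an illustration of this model; the function names and the sample points and vectors are assumptions.

```python
from fractions import Fraction as Fr

def translate(v, point):
    """Parallel translation p_v: move a point of the plane by the vector v."""
    return (point[0] + v[0], point[1] + v[1])

def add(v, w):
    """Sum of two vectors."""
    return (v[0] + w[0], v[1] + w[1])

def connecting_vector(a, b):
    """The vector v with p_v(a) = b; it exists (transitivity) and is unique (freeness)."""
    return (b[0] - a[0], b[1] - a[1])

ZERO = (Fr(0), Fr(0))
A = (Fr(1), Fr(2))
B = (Fr(7, 2), Fr(-1))
v = (Fr(1, 2), Fr(3))
w = (Fr(-2), Fr(5))

identity_holds = translate(ZERO, A) == A                                      # property (1)
composition_holds = translate(add(v, w), A) == translate(v, translate(w, A))  # property (2)
reaches_b = translate(connecting_vector(A, B), A) == B                        # transitivity
```

In this model a nonzero vector moves every point, so the action is free.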

Elements of an affine space are usually called points. We shall denote them
by capital letters $A$, $B$, $C$, etc. An affine space itself is sometimes called a point
space. A transformation $p_{\mathbf{v}}$ given by a vector $\mathbf{v} \in V$ is called a parallel translation
in an affine space $M$.

Let $U$ be a subspace in $V$. Let's choose a point $A \in M$ and then let's define a
subset $L \subset M$ in the following way:

$$ L = \{ B \in M : \ \exists \, \mathbf{u} \ ((\mathbf{u} \in U) \ \& \ (B = p_{\mathbf{u}}(A))) \}. \tag{1.1} $$

A subset $L$ of $M$ determined according to (1.1) is called a linear submanifold of
an affine space $M$. Thereby the subspace $U \subset V$ is called the directing subspace of
a linear submanifold $L$. The dimension of the directing subspace in (1.1) is taken
for the dimension of a linear submanifold $L$. One-dimensional linear submanifolds
are called straight lines; two-dimensional submanifolds are called planes. If the
dimension of $U$ is less by one than the dimension of $V$, i. e. if $\dim(V/U) = 1$, then
the corresponding linear submanifold $L$ is called a hyperplane. Linear submanifolds
of other intermediate dimensions have no special titles.

Let $U = \langle \mathbf{a} \rangle$ be a one-dimensional subspace in $V$. Then any vector $\mathbf{u} \in U$ is
presented as $\mathbf{u} = t \cdot \mathbf{a}$, where $t \in \mathbb{K}$. Upon choosing a point $A \in M$ the subspace
$U$ determines the straight line in $M$ passing through the point $A$. An arbitrary
point $A(t)$ of this straight line is given by the formula:

$$ A(t) = p_{t \cdot \mathbf{a}}(A). \tag{1.2} $$

The formula (1.2) is known as a parametric equation of a straight line in an affine
space, the vector $\mathbf{a}$ is called a directing vector, while $t \in \mathbb{K}$ is a parameter.
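
The parametric equation (1.2) can be sketched in the same coordinate model of the plane with rational coordinates. In the Python fragment below the point `A`, the directing vector `a`, and the function names are hypothetical choices; the list `segment` contains the points with $t = 0, 1/4, 1/2, 3/4, 1$.

```python
from fractions import Fraction as Fr

def translate(v, point):
    """Parallel translation p_v of a point given by its coordinates."""
    return tuple(p + c for p, c in zip(point, v))

def line_point(a_point, direction, t):
    """Point A(t) = p_{t*a}(A) of the straight line (1.2)."""
    return translate(tuple(t * c for c in direction), a_point)

A = (Fr(1), Fr(1))     # hypothetical initial point
a = (Fr(4), Fr(-2))    # hypothetical directing vector

# Five points of the line taken at equal steps of the parameter t from 0 to 1.
segment = [line_point(A, a, Fr(k, 4)) for k in range(5)]
```

The first element of `segment` is the point $A = A(0)$ itself and the last one is the point $A(1)$.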

If $\mathbb{K} = \mathbb{R}$, we can consider the set of points on the straight line (1.2)
corresponding to the values of $t$ taken from the interval $[0, 1] \subset \mathbb{R}$. Such set is called a
segment of a straight line. The points $A = A(0)$ and $B = A(1)$ are ending points of
this segment. One can choose a direction on the segment $AB$ by saying that one
of the ending points is the beginning of the segment and the other is the end of the
segment. A segment $AB$ with a fixed direction on it is called a directed segment or
an arrowhead segment. Two arrowhead segments $\overrightarrow{AB}$ and $\overrightarrow{BA}$ are assumed to be
distinct¹.

Let $A$ and $B$ be two points of an affine space $M$. Due to the transitivity of the
action of $V$ on $M$ there exists a vector $\mathbf{v} \in V$ that defines the parallel translation
$p_{\mathbf{v}}$ taking the point $A$ to the point $B$: $p_{\mathbf{v}}(A) = B$. Let's prove that such parallel
translation is unique. If $p_{\mathbf{w}}$ is another parallel translation such that $p_{\mathbf{w}}(A) = B$,
then for the parallel translation $p_{\mathbf{w}-\mathbf{v}}$ we have

$$ p_{\mathbf{w}-\mathbf{v}}(A) = p_{-\mathbf{v}} \circ p_{\mathbf{w}}(A) = p_{\mathbf{v}}^{-1}(p_{\mathbf{w}}(A)) = p_{\mathbf{v}}^{-1}(B) = A. $$

Since $V$ acts freely on $M$ (see definition 1.3), we have $\mathbf{w} - \mathbf{v} = 0$. Hence, $\mathbf{w} = \mathbf{v}$;
this proves the uniqueness of the vector $\mathbf{v}$ determined by the condition $p_{\mathbf{v}}(A) = B$.

The above fact appears to be very useful: if we have an affine space $(M, V)$,
then vectors of $V$ can be represented by arrowhead segments in $M$. Each pair of
points $A, B \in M$ specifies the unique vector $\mathbf{a} \in V$ such that $p_{\mathbf{a}}(A) = B$. This
vector can be used as a directing vector of the straight line (1.2) passing through
the points $A$ and $B$. The arrowhead segment with the beginning at the point
$A$ and with the end at the point $B$ is called the geometric representation of the
vector $\mathbf{a}$. It is denoted $\overrightarrow{AB}$.

A vector $\mathbf{a}$ is uniquely determined by its geometric representation $\overrightarrow{AB}$.
However, a vector $\mathbf{a}$ can have several geometric representations. Indeed, if we choose a
point $C \neq A$, we can determine the point $D = p_{\mathbf{a}}(C)$ and then we can construct
the geometric representation $\overrightarrow{CD}$ for the vector $\mathbf{a}$. The points $A$ and $C$ specify
a parallel translation $p_{\mathbf{b}}$ such that $p_{\mathbf{b}}(A) = C$. Using the property (4) of parallel
translations, it is easy to find that the parallel translation $p_{\mathbf{b}}$ maps the segment
$AB$ to the segment $CD$. So we conclude: various geometric representations of a
vector $\mathbf{a}$ are related to each other by means of parallel translations. Note that
$p_{\mathbf{a}+\mathbf{b}}(A) = D$. Therefore, $\overrightarrow{AD}$ is a geometric realization of the vector $\mathbf{a} + \mathbf{b}$. From
this fact we easily derive the well-known rules for vector addition: the triangle rule
$\overrightarrow{AC} + \overrightarrow{CD} = \overrightarrow{AD}$ and the parallelogram rule $\overrightarrow{AB} + \overrightarrow{AC} = \overrightarrow{AD}$.

¹ If $\mathbb{K} \neq \mathbb{R}$, an arrowhead segment $\overrightarrow{AB}$ is assumed to consist of the two points $A$ and $B$
only; it has no interior at all.

Let $O$ be some fixed point of an affine space $M$. Let's call it the origin. Then
any point $A \in M$ specifies the arrowhead segment $\overrightarrow{OA}$ which is identified with the unique
vector $\mathbf{r} \in V$ by means of the equality $p_{\mathbf{r}}(O) = A$. This vector $\mathbf{r} = \mathbf{r}_A$ is called
the radius-vector of the point $A$. If the space $V$ is finite-dimensional, then we can
choose a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and expand the radius-vectors of all points
$A \in M$ in this basis.

Definition 1.5. A frame or a coordinate system in an affine space $M$ is a pair
consisting of a point $O \in M$ and a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ in $V$. The coordinates of
the radius-vector $\mathbf{r}_A = \overrightarrow{OA}$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are called the coordinates of a
point $A$ in the coordinate system $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$.

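Coordinate systems in affine spaces play the same role as bases in linear vector
spaces. Let $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$ and $O', \tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ be two coordinate systems in an
affine space $M$. The relation of the bases $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ is given
by the direct and inverse transition matrices $S$ and $T$. The points $O$ and $O'$
determine the arrowhead segment $\overrightarrow{OO'}$ and the opposite arrowhead segment $\overrightarrow{O'O}$.
They are associated with two vectors $\mathbf{p}, \tilde{\mathbf{p}} \in V$:

$$\mathbf{p} = \overrightarrow{OO'}, \qquad \tilde{\mathbf{p}} = \overrightarrow{O'O}.$$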

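Let's expand $\mathbf{p}$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ and $\tilde{\mathbf{p}}$ in the basis $\tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$:

$$\begin{aligned}
\mathbf{p} &= p^1\,\mathbf{e}_1 + \ldots + p^n\,\mathbf{e}_n,\\
\tilde{\mathbf{p}} &= \tilde{p}^1\,\tilde{\mathbf{e}}_1 + \ldots + \tilde{p}^n\,\tilde{\mathbf{e}}_n.
\end{aligned} \tag{1.3}$$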



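Then consider a point $X \in M$. The following formulas are obvious:

$$\overrightarrow{OX} = \overrightarrow{OO'} + \overrightarrow{O'X}, \qquad \overrightarrow{O'X} = \overrightarrow{O'O} + \overrightarrow{OX}.$$

By means of them we can find the relation of the coordinates of a point $X$ in two
different coordinate systems $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$ and $O', \tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$:

$$x^i = p^i + \sum_{j=1}^{n} S^i_j\,\tilde{x}^j, \qquad \tilde{x}^i = \tilde{p}^i + \sum_{j=1}^{n} T^i_j\,x^j. \tag{1.4}$$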

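Though the vectors $\mathbf{p}$ and $\tilde{\mathbf{p}}$ differ only in sign ($\tilde{\mathbf{p}} = -\mathbf{p}$), their coordinates in
formulas (1.4) differ much more:

$$p^i = -\sum_{j=1}^{n} S^i_j\,\tilde{p}^j, \qquad \tilde{p}^i = -\sum_{j=1}^{n} T^i_j\,p^j.$$

This happens because $\mathbf{p}$ and $\tilde{\mathbf{p}}$ are expanded in two different bases (see the above
expansions (1.3)).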
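The change of coordinates (1.4) can be sketched numerically. The matrix `S` and the vectors below are hypothetical sample data; the columns of `S` are assumed to hold the coordinates of the new basis vectors in the old basis, so that `T` is its inverse.

```python
import numpy as np

# Hypothetical transition matrix (its determinant is 3, so it is invertible).
S = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
T = np.linalg.inv(S)                 # inverse transition matrix

p = np.array([1.0, -2.0, 0.5])       # coordinates of OO' in the old basis
p_tilde = -T @ p                     # coordinates of O'O in the new basis

x_new = np.array([0.3, 4.0, -1.0])   # coordinates of X in the new frame
x_old = p + S @ x_new                # first formula of (1.4)
x_back = p_tilde + T @ x_old         # second formula of (1.4)

roundtrip_ok = np.allclose(x_back, x_new)
sign_relation_ok = np.allclose(p, -S @ p_tilde)
print(roundtrip_ok, sign_relation_ok)
```

The two formulas of (1.4) are inverse to each other exactly because the coordinates of the opposite vector are computed through `T`, not by flipping the signs of the components of `p`.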

The facts from the theory of affine spaces, which we stated above, show that
considering affine spaces is a proper way for geometrization of the linear algebra.
A vector is an algebraic object: we can add vectors, we can multiply them by
numbers, and we can form linear combinations of them. In affine space the 
concept of a point becomes paramount. Points form straight lines, planes, and 
their multidimensional generalizations — linear submanifolds. In affine spaces we 
have a quite natural concept of parallel translations and, hence, we can define the 
concept of parallelism for linear submanifolds. The geometry of two-dimensional 
affine spaces is called the planimetry, the geometry of three-dimensional affine 
spaces is called the stereometry. Affine spaces of higher dimensions are studied by 
a geometrical discipline which is called the multidimensional geometry. 

§ 2. Euclidean point spaces. 
Quadrics in a Euclidean space. 

Definition 2.1. An affine space $(M, V)$ over the field of real numbers $\mathbb{R}$ is
called a Euclidean point space if the space $V$ acting on $M$ by parallel translations
is equipped with a structure of a Euclidean vector space, i. e. if in $V$ some positive
quadratic form $g$ is fixed.

In affine spaces, which we considered in the previous section, a very important
feature was lacking: there was no concept of a length and there was no concept of
an angle. The structure of a Euclidean space given by a quadratic form $g$ brings
this lacking feature in. Let $A$ and $B$ be two points of a Euclidean point space
$M$. They determine a vector $\mathbf{v} \in V$ specified by the condition $p_{\mathbf{v}}(A) = B$ (this
vector is identified with the arrowhead segment $\overrightarrow{AB}$). The norm of the vector $\mathbf{v}$
determined by the quadratic form $g$ is called the length of the segment $AB$ or the
distance between two points $A$ and $B$: $|AB| = |\mathbf{v}| = \sqrt{g(\mathbf{v})}$. Due to the equality
$|-\mathbf{v}| = |\mathbf{v}|$ we derive $|AB| = |BA|$.

Let AB and CD be two arrowhead segments in a Euclidean point space. They 
are geometric representations of two vectors v and w of V. The angle between 
AB and CD by definition is the angle between vectors v and w determined by 
the formula (1.6) of Chapter V. 

Definition 2.2. A coordinate system $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$ in a finite-dimensional
Euclidean point space $(M, V, g)$ is called a rectangular Cartesian coordinate system
in $M$ if $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is an orthonormal basis of the Euclidean vector space $(V, g)$.

Definition 2.3. A quadric in a Euclidean point space $M$ is a set of points in
$M$ whose coordinates $x^1, \ldots, x^n$ in some rectangular Cartesian coordinate system
$O, \mathbf{e}_1, \ldots, \mathbf{e}_n$ satisfy some polynomial equation of degree two:

$$\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,x^i x^j + 2\sum_{i=1}^{n} b_i\,x^i + c = 0. \tag{2.1}$$

The definition of a quadric is not coordinate-free. It is formulated in terms of
some rectangular Cartesian coordinate system $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$. However, passing to
another Cartesian coordinate system is equivalent to a linear change of variables
in the equation (2.1) (see formulas (1.4)). Such a change of variables changes
the coefficients of the polynomial in (2.1), but it does not change the structure
of this equation as a whole. A quadric continues to be a quadric in any Cartesian
coordinate system.
Let $O', \tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$ be some other rectangular Cartesian coordinate system in
$M$. Let's consider the passage from $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$ to $O', \tilde{\mathbf{e}}_1, \ldots, \tilde{\mathbf{e}}_n$. In this
case transition matrices $S$ and $T$ in (1.4) appear to be orthogonal matrices (see
formulas (1.13) in Chapter V). We can calculate the coefficients of the equation of
quadric in the new coordinate system. Substituting (1.4) into (2.1), we get

$$\tilde{a}_{qk} = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,S^i_q\,S^j_k, \tag{2.2}$$

$$\tilde{b}_k = \sum_{i=1}^{n} b_i\,S^i_k + \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,p^j\,S^i_k, \tag{2.3}$$

$$\tilde{c} = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\,p^i p^j + 2\sum_{i=1}^{n} b_i\,p^i + c. \tag{2.4}$$

Now the problem of bringing the equation of a quadric to a canonic form is 
formulated as the problem of finding a proper rectangular Cartesian coordinate 
system in which the equation (2.1) has the most simple canonic form. 
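The transformation rules (2.2)-(2.4) can be checked numerically by evaluating the left-hand side of (2.1) at the same point in both coordinate systems. This is a minimal sketch in matrix notation, assuming a symmetric coefficient matrix `a`; the function names and the sample data are hypothetical.

```python
import numpy as np

def transform_quadric(a, b, c, S, p):
    """Coefficients of (2.1) after the substitution x = p + S @ x_new."""
    a_new = S.T @ a @ S                  # formula (2.2)
    b_new = S.T @ (b + a @ p)            # formula (2.3)
    c_new = p @ a @ p + 2 * b @ p + c    # formula (2.4)
    return a_new, b_new, c_new

def quadric_value(a, b, c, x):
    """Left-hand side of the equation (2.1) at the point with coordinates x."""
    return x @ a @ x + 2 * b @ x + c

# Hypothetical sample data: a rotation S and a shift p of the origin.
theta = 0.7
S = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
p = np.array([1.0, 2.0])
a = np.array([[2.0, 1.0],
              [1.0, -1.0]])
b = np.array([0.5, -1.5])
c = 3.0

a_new, b_new, c_new = transform_quadric(a, b, c, S, p)

x_new = np.array([-0.4, 1.3])    # coordinates in the new system
x_old = p + S @ x_new            # formula (1.4)
values_agree = np.isclose(quadric_value(a, b, c, x_old),
                          quadric_value(a_new, b_new, c_new, x_new))
print(values_agree)
```

The check does not rely on `S` being orthogonal; orthogonality matters only for keeping the new coordinate system rectangular.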

The formula (2.2) coincides with the transformation formula for the components
of a quadratic form under a change of basis (see (1.11) in Chapter IV). Hence,
we conclude that each quadric in $M$ is associated with some quadratic form in
$V$. The form $a$ determined by the matrix $a_{ij}$ in the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is called the
primary quadratic form of a quadric (2.1).

Let's consider the associated operator $f_a$ determined by the primary quadratic
form $a$ (see formula (3.9) in Chapter V). The operator $f_a$ is a selfadjoint operator
in $V$; it determines the expansion of the space $V$ into the direct sum of two
mutually orthogonal subspaces $\operatorname{Ker} f_a$ and $\operatorname{Im} f_a$:

$$V = \operatorname{Ker} f_a \oplus \operatorname{Im} f_a \tag{2.5}$$

(see (3.14) in Chapter V). The matrix of the operator $f_a$ is given by the formula

$$F^i_j = \sum_{k=1}^{n} g^{ik}\,a_{kj}, \tag{2.6}$$

where $g^{ik}$ is the matrix inverse to the Gram matrix of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Apart
from $f_a$, we define a vector $\mathbf{b}$ through its coordinates given by the formula

$$b^i = \sum_{k=1}^{n} g^{ik}\,b_k. \tag{2.7}$$

The definition of $\mathbf{b}$ through its coordinates (2.7) is essentially bound to the
coordinate system $O, \mathbf{e}_1, \ldots, \mathbf{e}_n$. This is because the formula (2.3) differs from
the standard transformation formula for the coordinates of a covector under a
change of basis (see (2.4) in Chapter III). Let's rewrite (2.3) in the following form:

$$\tilde{b}_k = \sum_{i=1}^{n} \Bigl( b_i + \sum_{j=1}^{n} a_{ij}\,p^j \Bigr) S^i_k. \tag{2.8}$$
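Then let's consider the expansion of the vector $\mathbf{b}$ into the sum of two vectors
$\mathbf{b} = \mathbf{b}^{(1)} + \mathbf{b}^{(2)}$ according to the expansion (2.5) of the space $V$, with
$\mathbf{b}^{(1)} \in \operatorname{Ker} f_a$ and $\mathbf{b}^{(2)} \in \operatorname{Im} f_a$. This expansion
induces the expansion $b_i = b^{(1)}_i + b^{(2)}_i$, where $b^{(1)}_i$ are transformed as follows:

$$\tilde{b}^{(1)}_k = \sum_{i=1}^{n} b^{(1)}_i\,S^i_k. \tag{2.9}$$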

The vector $\mathbf{b}^{(2)}$ in the expansion $\mathbf{b} = \mathbf{b}^{(1)} + \mathbf{b}^{(2)}$ can be annihilated at the expense
of a proper choice of the coordinate system. Let's determine the vector $\mathbf{p} = \overrightarrow{OO'}$
from the equality $\mathbf{b}^{(2)} = -f_a(\mathbf{p})$. Though it is not unique, the vector $\mathbf{p}$ satisfying
this equality does exist since $\mathbf{b}^{(2)} \in \operatorname{Im} f_a$. For its components we have

$$b^{(2)}_i + \sum_{j=1}^{n} a_{ij}\,p^j = 0; \tag{2.10}$$

this follows from $\mathbf{b}^{(2)} = -f_a(\mathbf{p})$ due to (2.6) and (2.7). Substituting (2.10) into
(2.8), we get the following equalities in the new coordinate system:

$$\tilde{b}^{(2)}_k = 0, \qquad \tilde{b}_k = \tilde{b}^{(1)}_k.$$
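The choice of the new origin can be sketched numerically. The sketch assumes a rectangular Cartesian coordinate system, so the Gram matrix is the unit matrix and, by (2.6) and (2.7), the matrix of $f_a$ coincides with $a_{ij}$ and $b^i = b_i$. The sample data and the tolerance `tol` are hypothetical; the eigendecomposition is just one convenient way to separate the kernel from the image, not a method prescribed by the text.

```python
import numpy as np

# Hypothetical degenerate quadric: Ker f is spanned by (1, -1, 0).
a = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
b = np.array([3.0, 1.0, 4.0])

# Orthonormal eigenvectors of the symmetric matrix a.
w, U = np.linalg.eigh(a)
tol = 1e-12                          # assumed threshold for a zero eigenvalue
in_image = np.abs(w) > tol
U_im = U[:, in_image]                # orthonormal basis of Im f

b2 = U_im @ (U_im.T @ b)             # component of b in Im f
b1 = b - b2                          # component of b in Ker f

# One particular solution of f(p) = -b2, i.e. of the equations (2.10).
p = -U_im @ ((U_im.T @ b2) / w[in_image])

b_new = b + a @ p                    # formula (2.8) with S equal to the unit matrix
print(np.allclose(a @ p, -b2), np.allclose(b_new, b1))
```

Adding any vector of $\operatorname{Ker} f_a$ to `p` gives another solution, which reflects the non-uniqueness mentioned above.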

The relationships (2.9) show that the numbers $b^{(1)}_i$ cannot be annihilated (unless
they are equal to zero from the very beginning). These numbers determine the
vector $\mathbf{b}^{(1)} \in \operatorname{Ker} f_a$ which does not depend on the choice of a coordinate system.
As a result we have proved the following theorem.

Theorem 2.1. Any quadric in a Euclidean point space $(M, V, g)$ is associated
with some selfadjoint operator $f$ and some vector $\mathbf{b} \in \operatorname{Ker} f$ such that in some
rectangular Cartesian coordinate system the radius-vector $\mathbf{r}$ of an arbitrary point
of this quadric satisfies the following equation:

$$(f(\mathbf{r})\,|\,\mathbf{r}) + 2\,(\mathbf{b}\,|\,\mathbf{r}) + c = 0. \tag{2.11}$$

The operator $f$ determines the leading part of the equation (2.11). By means
of this operator we subdivide all quadrics into two basic types:

(1) non-degenerate quadrics, when $\operatorname{Ker} f = \{0\}$;

(2) degenerate quadrics, when $\operatorname{Ker} f \neq \{0\}$.

For non-degenerate quadrics the vector $\mathbf{b}$ in (2.11) is equal to zero, since
$\mathbf{b} \in \operatorname{Ker} f = \{0\}$. Therefore, non-degenerate quadrics are subdivided into three types:

(1) elliptic type, when $c \neq 0$ and the quadratic form $a(\mathbf{x}) = (f(\mathbf{x})\,|\,\mathbf{x})$ is positive
or negative, i. e. can be made positive by changing the sign of $f$;

(2) hyperbolic type, when $c \neq 0$ and the quadratic form $a(\mathbf{x}) = (f(\mathbf{x})\,|\,\mathbf{x})$ is not
sign-definite, i. e. its signature has both pluses and minuses;

(3) conic type, when $c = 0$.

Degenerate quadrics are subdivided into two types:

(1) parabolic type, when $\dim \operatorname{Ker} f = 1$ and $\mathbf{b} \neq 0$;

(2) cylindric type, when $\dim \operatorname{Ker} f > 1$ or $\mathbf{b} = 0$.
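The subdivision above can be written as a small decision procedure. This is a sketch only: it assumes the equation has already been brought to the form (2.11) in a rectangular Cartesian coordinate system, so that `a` is a symmetric matrix and `b` lies in its kernel. The function name `classify_quadric` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def classify_quadric(a, b, c, tol=1e-12):
    """Type of the quadric (f(r)|r) + 2(b|r) + c = 0 with b in Ker f."""
    w = np.linalg.eigvalsh(a)                  # eigenvalues of the symmetric matrix
    kernel_dim = int(np.sum(np.abs(w) <= tol))
    if kernel_dim == 0:                        # non-degenerate, hence b = 0
        if abs(c) <= tol:
            return "conic"
        if np.all(w > 0) or np.all(w < 0):     # sign-definite quadratic form
            return "elliptic"
        return "hyperbolic"
    if kernel_dim == 1 and np.linalg.norm(b) > tol:
        return "parabolic"
    return "cylindric"

zero = np.zeros(2)
print(classify_quadric(np.diag([1.0, 4.0]), zero, -1.0))
print(classify_quadric(np.diag([1.0, -1.0]), zero, -1.0))
print(classify_quadric(np.diag([1.0, -1.0]), zero, 0.0))
print(classify_quadric(np.diag([1.0, 0.0]), np.array([0.0, -1.0]), 0.0))
print(classify_quadric(np.diag([1.0, 0.0]), zero, -1.0))
```

Comparing eigenvalues with a fixed tolerance is a simplification; for badly scaled coefficients a relative threshold would be more appropriate.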
The equation (2.1) in the case of a non-degenerate quadric of elliptic type can
be brought to the following canonic form:

$$\frac{(x^1)^2}{(a_1)^2} + \ldots + \frac{(x^n)^2}{(a_n)^2} = \pm 1.$$

This is the canonic equation of a non-degenerate quadric of hyperbolic type:

$$\frac{(x^1)^2}{(a_1)^2} \pm \ldots \pm \frac{(x^n)^2}{(a_n)^2} = \pm 1.$$

The canonic equation of a non-degenerate quadric of conic type is homogeneous:

$$\frac{(x^1)^2}{(a_1)^2} \pm \ldots \pm \frac{(x^n)^2}{(a_n)^2} = 0.$$

The equation (2.1) in the case of a degenerate quadric of parabolic type can
be brought to the following canonic form:

$$\frac{(x^1)^2}{(a_1)^2} \pm \ldots \pm \frac{(x^{n-1})^2}{(a_{n-1})^2} = 2\,x^n.$$
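For the elliptic case the passage to the canonic form can be sketched numerically: the origin is moved to the point where the linear terms vanish, and the basis is rotated to the orthonormal eigenvectors of the matrix $a_{ij}$. The coefficients below are hypothetical sample data, and the eigendecomposition is one possible way to find the required orthogonal matrix `S`.

```python
import numpy as np

# Hypothetical elliptic quadric: a is positive definite (eigenvalues 1 and 6).
a = np.array([[5.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 3.0])
c = -10.0

p = -np.linalg.solve(a, b)           # new origin: a @ p + b = 0
lam, S = np.linalg.eigh(a)           # S is orthogonal, S.T @ a @ S = diag(lam)
c_new = p @ a @ p + 2 * b @ p + c    # formula (2.4)

# Canonic form: sum of (x_new[i] / semi_axes[i])**2 equals 1,
# valid here because c_new < 0 and all eigenvalues are positive.
semi_axes = np.sqrt(-c_new / lam)

# A point on the canonic ellipse, mapped back to the original coordinates.
t = 0.9
x_new = semi_axes * np.array([np.cos(t), np.sin(t)])
x_old = p + S @ x_new                # formula (1.4)
residual = x_old @ a @ x_old + 2 * b @ x_old + c
print(abs(residual) < 1e-9)
```

The numbers `semi_axes` play the role of $a_1, \ldots, a_n$ in the canonic equation of elliptic type.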

If $n = \dim M > 1$, then in a canonic equation of a quadric of cylindric type
there is no explicit entry of at least one variable. Therefore, we can reduce the
dimension of the space $M$. The reduced quadric can belong to any one of the
above types. If it is again of cylindric type, then we can repeat the reduction
procedure. This process can terminate in some intermediate dimension yielding
the reduced quadric of some non-cylindric type. Otherwise we shall reach the
dimension $\dim M = 1$. In a one-dimensional Euclidean point space there are no
quadrics of cylindric type. Therefore, the quadrics of cylindric type are those
which belong to one of the non-cylindric types in the reduced dimension.